Abstract

We consider a generalization of the Squared Euclidean Facility Location Problem,
when the distance function is a squared metric, which we call Squared Metric Facility
Location Problem (SMFLP). We show that there is no approximation algorithm with
factor better than 2.04 for the SMFLP, assuming P ≠ NP. We analyze the best known
algorithms for the Metric Facility Location Problem (MFLP) based on primal-dual
and LP-rounding techniques when they are applied to the SMFLP. We prove very
tight bounds for these algorithms, and show that the LP-rounding algorithm achieves
a ratio of 2.04 and therefore is the best possible for SMFLP. Also, we propose a new
technique to systematically bound factor-revealing programs, and use it in the
dual-fitting analysis of the primal-dual algorithms for both the SMFLP and the MFLP,
simplifying and improving some of the previous analysis for the MFLP.
1 Introduction



Let $C$ and $F$ be finite disjoint sets. We call cities the elements of $C$ and facilities the elements
of $F$. For each facility $i$ and city $j$, let $c_{ij}$ be a non-negative number representing the cost
to connect $i$ to $j$. Additionally, let $f_i$ be a non-negative number representing the cost to
open facility $i$. For each city $j$ and subset $F'$ of $F$, let $c(F', j) = \min_{i \in F'} c_{ij}$. The Facility
Location Problem (FLP) consists of the following: given sets $C$ and $F$, and $c$ and $f$ as
above, find a subset $F'$ of $F$ such that $\sum_{i \in F'} f_i + \sum_{j \in C} c(F', j)$ is minimum. Hochbaum [9]
presented an $O(\log n)$-approximation for the FLP. Archer [1] showed that this result is
asymptotically tight, unless $\mathrm{NP} \subseteq \mathrm{DTIME}[n^{O(\log\log n)}]$, by presenting a reduction from the
Set Cover problem.
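To make the objective concrete, the following is a minimal sketch that evaluates $\sum_{i \in F'} f_i + \sum_{j \in C} c(F', j)$ and finds the best $F'$ by exhaustive search. The instance, the facility and city names, and the helper names `flp_cost` and `brute_force_flp` are illustrative assumptions, not part of the paper; the search is exponential and only meant for tiny instances.

```python
from itertools import combinations

def flp_cost(open_facilities, cities, f, c):
    """Opening cost plus, for each city, its cheapest connection to an open facility."""
    opening = sum(f[i] for i in open_facilities)
    connection = sum(min(c[i][j] for i in open_facilities) for j in cities)
    return opening + connection

def brute_force_flp(facilities, cities, f, c):
    """Try every non-empty subset F' of F and keep the cheapest one."""
    best = None
    for size in range(1, len(facilities) + 1):
        for subset in combinations(facilities, size):
            cost = flp_cost(subset, cities, f, c)
            if best is None or cost < best[0]:
                best = (cost, set(subset))
    return best

# Hypothetical toy instance (not taken from the paper).
facilities = ["a", "b"]
cities = [1, 2, 3]
f = {"a": 3, "b": 2}
c = {"a": {1: 1, 2: 4, 3: 6},
     "b": {1: 5, 2: 2, 3: 1}}

best_cost, best_set = brute_force_flp(facilities, cities, f, c)
```

On this toy instance, opening only `a` costs 14, opening only `b` costs 10, and opening both costs 9, so the search returns both facilities.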

A well-studied particular case of the FLP is its so-called metric variant. We say that
an instance $(C, F, c, f)$ of the FLP is metric if $c_{ij} \le c_{ij'} + c_{i'j'} + c_{i'j}$, for all facilities $i$
and $i'$, and cities $j$ and $j'$. In the context of the FLP, this inequality is called the triangle
inequality. The Metric FLP, denoted by MFLP, is the particular case of the FLP that
considers only metric instances. Several algorithms were proposed in the literature for the
MFLP [3, 5, 7, 11, 14, 15, 17, 18]. In particular, the best known algorithm for the MFLP is
a 1.488-approximation proposed by Li [15]. Also, there is an inapproximability result that
states that there is no approximation algorithm for the MFLP with a ratio smaller than
1.463, unless $\mathrm{NP} \subseteq \mathrm{DTIME}[n^{O(\log\log n)}]$ [8]. This last result was strengthened by Sviridenko,
who showed that the lower bound holds unless P = NP [19].

The Euclidean FLP is a particular case of the MFLP also considered in the literature.
In the Euclidean FLP, one is given a position in a Euclidean space for each city and for
each facility, and the cost $c_{ij}$ is the Euclidean distance between the position of facility $i$ and
the position of city $j$. There is a PTAS for the Euclidean FLP in 2-dimensional space, by
Arora, Raghavan, and Rao [2].

Yet another variant considered in the literature is the so-called Squared Euclidean
FLP, denoted here by E$^2$FLP. In this variant, as in the Euclidean case, one is given a
position in a Euclidean space for each city and for each facility. Here, the cost $c_{ij}$ is the
square of the Euclidean distance between the position of facility $i$ and the position of city $j$.
This cost measure is known as $\ell_2^2$ and was, for instance, considered by Jain and Vazirani [14,
pp. 292-293] in the context of the FLP. Their approach implies a 9-approximation for the
E$^2$FLP.

We consider instances $(C, F, c, f)$ of the FLP such that a relaxed version of the triangle
inequality is satisfied. We say that a cost function $c$ is a squared metric if, for all facilities $i$
and $i'$, and cities $j$ and $j'$, we have $\sqrt{c_{ij}} \le \sqrt{c_{ij'}} + \sqrt{c_{i'j'}} + \sqrt{c_{i'j}}$. The particular case of
the FLP that only considers instances with a squared metric is called Squared Metric
FLP, and is denoted by SMFLP. Notice that the SMFLP is a generalization of the E$^2$FLP
and of the MFLP. Thus any approximation for the SMFLP is also an approximation for
the E$^2$FLP or the MFLP, and the inapproximability results for the MFLP are also valid
for the SMFLP. The 9-approximation of Jain and Vazirani [14] applies also to the SMFLP
and, to our knowledge, it has the best previously known approximation factor. The choice
of squared metrics discourages excessive distances in the solution. This effect is important
in several applications, such as $k$-means and classification problems.

Although there are several algorithms for the MFLP in the literature, there are very 
few works on the SMFLP. Nevertheless, one may try to solve an instance of the SMFLP
using good algorithms designed for the MFLP. Since these algorithms and their analysis 
are based on the assumption of the triangle inequality, it is reasonable to expect that they 
generate good solutions also for the SMFLP. However, there is no trivial way to derive an 
approximation factor from the MFLP to the SMFLP, so each algorithm must be reanalyzed 
individually. In this paper, we analyze three primal-dual algorithms (the 1.861 and the 1.61- 
approximation algorithms of Jain et al. [11], and the 1.52-approximation of Mahdian, Ye and 
Zhang [17]) and an LP-rounding algorithm (Chudak and Shmoys's algorithm [6] used in the 
1.5-approximation of Byrka and Aardal [3]) when applied to squared metric FLP instances. 
We show that these algorithms achieve ratios of 2.87, 2.43, 2.17, and 2.04 respectively. The 
last approximation factor is the best possible, as we show a 2.04-inapproximability limit for 
the SMFLP. This was obtained by extending the hardness results of Guha and Khuller [8] 
for the metric case. 

The original analyses of the three primal-dual algorithms are based on the so-called
families of factor-revealing linear programs [11, 17]. The lower bound on the approximation 
factor is given by a computer calculated solution of any program in this family. The upper 
bound, however, is obtained analytically by bounding the value of every program in this 
family, which requires long and tedious proofs. In this paper, we propose a way to obtain 
a new family of upper bound factor-revealing programs for the SMFLP, as an alternative 
technique to achieve an upper bound. Now, the upper bound on the approximation factor 
is also obtained by a computer calculated solution of a single program. We note that, in 
our case, the factor-revealing programs are nonlinear, since the squared metric constraints 
contain square roots. We tackle this by replacing these constraints with an infinite set of 
linear constraints. 

Recently, we became aware of the strongly factor-revealing linear programs, proposed by 
Mahdian and Yan [16]. Our upper bound factor-revealing program is similar to a strongly 
factor-revealing program. The techniques involved in obtaining our program, however, are 
different. To obtain a strongly factor-revealing linear program, one projects a solution of 
LP(md) in LP(m), and tries to adjust the constraints to obtain a feasible solution. In our
approach, we define a candidate dual solution for LP(k) from a fixed number of variables,
and obtain an upper bound factor-revealing program directly in the form of a minimization 
program. For the case of the SMFLP, we observed that calculating the dual upper bound 
program is easier than projecting the solutions on the primal. Also, we have considered the 
case of the MFLP, for which the obtained lower and upper bound factor-revealing programs 
converge. 

Our contribution is twofold. First, we make an important step towards generalizing the
squared Euclidean distance and successfully analyze this generalization in the context of FLP. 
Second, we propose a new technique to systematically bound factor-revealing programs, and 
use it in the dual-fitting analysis of the primal-dual algorithms for both the SMFLP and the 
MFLP. 
2 Preliminaries



Observe that the E$^2$FLP (and therefore the SMFLP) is not a particular case of the MFLP.
For example, consider an instance of the E$^2$FLP consisting of two facilities $i$ and $i'$ at positions
$(0, 0)$ and $(0, 2)$, and two cities $j$ and $j'$ at $(0, 1)$ and $(0, 3)$. Thus, the cost function $c$ is such
that $c_{ij'} = 9$, and $c_{ij} = c_{i'j} = c_{i'j'} = 1$, so $c$ does not satisfy the triangle inequality.
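The example can be checked mechanically. The sketch below recomputes the four costs from the positions and tests both the triangle inequality and the squared metric inequality over all choices of $i, i', j, j'$; the helper names `squared_euclidean` and `satisfies` are assumptions made for illustration.

```python
from itertools import product
from math import sqrt

def squared_euclidean(p, q):
    """Square of the Euclidean distance between points p and q."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def satisfies(cost, facilities, cities, transform):
    """Check t(c_ij) <= t(c_ij') + t(c_i'j') + t(c_i'j) for all i, i', j, j'."""
    eps = 1e-12  # tolerance for floating-point square roots
    for i, i2, j, j2 in product(facilities, facilities, cities, cities):
        lhs = transform(cost[i, j])
        rhs = (transform(cost[i, j2]) + transform(cost[i2, j2])
               + transform(cost[i2, j]))
        if lhs > rhs + eps:
            return False
    return True

fac = {"i": (0, 0), "i'": (0, 2)}
cit = {"j": (0, 1), "j'": (0, 3)}
cost = {(i, j): squared_euclidean(p, q)
        for i, p in fac.items() for j, q in cit.items()}

is_metric = satisfies(cost, fac, cit, lambda x: x)   # triangle inequality
is_squared_metric = satisfies(cost, fac, cit, sqrt)  # squared metric inequality
```

The triangle inequality fails because $9 > 1 + 1 + 1$, while the squared metric inequality holds with equality, $3 = 1 + 1 + 1$.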

Although the constraints over the cost function c from an SMFLP instance are defined 
by square roots, they are convex. Indeed, the next lemma shows that a squared metric can 
be expressed by an infinite set of linear inequalities. As a consequence, for any cost function 
not satisfying the squared metric inequality, there exists some linear inequality, as defined 
in Lemma 1, that is violated. 

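Lemma 1. Let $A$, $B$, $C$, and $D$ be non-negative numbers. Then $\sqrt{A} \le \sqrt{B} + \sqrt{C} + \sqrt{D}$ if
and only if $A \le (1 + \beta + \frac{1}{\gamma})B + (1 + \gamma + \frac{1}{\delta})C + (1 + \delta + \frac{1}{\beta})D$ for every positive numbers $\beta$,
$\gamma$, and $\delta$. In particular, if $\sqrt{A} \le \sqrt{B} + \sqrt{C} + \sqrt{D}$, then $A \le 3B + 3C + 3D$.

Proof. Suppose $\sqrt{A} \le \sqrt{B} + \sqrt{C} + \sqrt{D}$. As $(\sqrt{\beta B} - \sqrt{D/\beta})^2 \ge 0$, we have that $2\sqrt{BD} \le
\beta B + D/\beta$. Similarly, $2\sqrt{CB} \le \gamma C + B/\gamma$ and $2\sqrt{DC} \le \delta D + C/\delta$. Therefore, if $\sqrt{A} \le
\sqrt{B} + \sqrt{C} + \sqrt{D}$, then

$$
\begin{aligned}
A &\le (\sqrt{B} + \sqrt{C} + \sqrt{D})^2 \\
  &= B + C + D + 2\sqrt{BD} + 2\sqrt{CB} + 2\sqrt{DC} \\
  &\le B + C + D + \beta B + D/\beta + \gamma C + B/\gamma + \delta D + C/\delta \\
  &= (1 + \beta + \tfrac{1}{\gamma})B + (1 + \gamma + \tfrac{1}{\delta})C + (1 + \delta + \tfrac{1}{\beta})D.
\end{aligned}
$$

Choosing $\beta = \gamma = \delta = 1$, we obtain $A \le 3B + 3C + 3D$.

Now suppose $\sqrt{A} > \sqrt{B} + \sqrt{C} + \sqrt{D}$. Let $d > 0$ be such that $A = B + C + D + 2\sqrt{BD} +
2\sqrt{CB} + 2\sqrt{DC} + d$. Then, $A > (1 + \beta + \frac{1}{\gamma})B + (1 + \gamma + \frac{1}{\delta})C + (1 + \delta + \frac{1}{\beta})D$ is equivalent
to $(\beta + \frac{1}{\gamma})B + (\gamma + \frac{1}{\delta})C + (\delta + \frac{1}{\beta})D < 2\sqrt{BD} + 2\sqrt{CB} + 2\sqrt{DC} + d$. We will analyze the
cases in which none, one, two or all numbers $B$, $C$ and $D$ are zero. Let $\varepsilon$ and $\varepsilon'$ be positive
numbers such that $\varepsilon + \varepsilon' < 1$.

Case 1: $B, C, D > 0$. Let $\beta = \sqrt{D/B}$, $\gamma = \sqrt{B/C}$ and $\delta = \sqrt{C/D}$. Then $(\beta + \frac{1}{\gamma})B + (\gamma + \frac{1}{\delta})C +
(\delta + \frac{1}{\beta})D = 2\sqrt{BD} + 2\sqrt{CB} + 2\sqrt{DC} < 2\sqrt{BD} + 2\sqrt{CB} + 2\sqrt{DC} + d$.

Case 2: $B = 0$ and $C, D > 0$. Let $\beta = \frac{D}{\varepsilon' d}$, $\gamma = \frac{\varepsilon d}{C}$ and $\delta = \sqrt{C/D}$. Then $(\beta + \frac{1}{\gamma})B + (\gamma +
\frac{1}{\delta})C + (\delta + \frac{1}{\beta})D = 2\sqrt{DC} + (\varepsilon + \varepsilon')d < 2\sqrt{BD} + 2\sqrt{CB} + 2\sqrt{DC} + d$.

Case 3: $B, C = 0$ and $D > 0$. Let $\beta = \frac{D}{\varepsilon' d}$, $\gamma = 1$ and $\delta = \frac{\varepsilon d}{D}$. Then $(\beta + \frac{1}{\gamma})B + (\gamma + \frac{1}{\delta})C +
(\delta + \frac{1}{\beta})D = (\varepsilon + \varepsilon')d < 2\sqrt{BD} + 2\sqrt{CB} + 2\sqrt{DC} + d$.

Case 4: $B, C, D = 0$. Let $\beta = 1$, $\gamma = 1$ and $\delta = 1$. Then $(\beta + \frac{1}{\gamma})B + (\gamma + \frac{1}{\delta})C + (\delta + \frac{1}{\beta})D =
0 < 2\sqrt{BD} + 2\sqrt{CB} + 2\sqrt{DC} + d$. □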
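The constructive direction of Lemma 1 can be read as a separation routine: given values that violate the square-root inequality, it returns multipliers whose linear inequality is violated as well. The following is a hedged sketch of that idea. The function name `violated_inequality`, the tolerance, and the choice of giving each degenerate pair one third of the slack $d$ (which corresponds to $\varepsilon = \varepsilon' = 1/3$ in the case analysis) are assumptions made for illustration.

```python
from math import sqrt

def violated_inequality(A, B, C, D, tol=1e-9):
    """Return coefficients (cB, cC, cD) with A > cB*B + cC*C + cD*D,
    or None when sqrt(A) <= sqrt(B) + sqrt(C) + sqrt(D) already holds."""
    root_sum = sqrt(B) + sqrt(C) + sqrt(D)
    if sqrt(A) <= root_sum + tol:
        return None
    slack = A - root_sum ** 2   # the quantity d > 0 in the proof of Lemma 1
    share = slack / 3.0         # assumed split of the slack (epsilon = epsilon' = 1/3)

    def multiplier(X, Y):
        # Pick m > 0 such that m*X + Y/m equals 2*sqrt(X*Y) plus at most `share`.
        if X > 0 and Y > 0:
            return sqrt(Y / X)
        if X > 0:               # Y == 0, so only m*X remains
            return share / X
        if Y > 0:               # X == 0, so only Y/m remains
            return Y / share
        return 1.0

    beta = multiplier(B, D)     # handles beta*B + D/beta
    gamma = multiplier(C, B)    # handles gamma*C + B/gamma
    delta = multiplier(D, C)    # handles delta*D + C/delta
    return (1 + beta + 1 / gamma, 1 + gamma + 1 / delta, 1 + delta + 1 / beta)

# Example: sqrt(100) = 10 > 1 + 2 + 3, so a violated linear inequality exists.
cut = violated_inequality(100.0, 1.0, 4.0, 9.0)
```

When $B$, $C$, and $D$ are all positive, the returned coefficients give the tightest linear bound, whose right-hand side equals $(\sqrt{B} + \sqrt{C} + \sqrt{D})^2$.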
We also use the concept of a bi-factor approximation algorithm, usually adopted in the
context of the FLP. Consider an algorithm that generates a solution with possibly distinct 
approximation factors for facility and connection costs. A bi-factor approximation for the 
FLP, as defined by Mahdian, Ye and Zhang [17], is described in the following: 

Definition 2 (Bi-factor approximation algorithm [17]). An algorithm is called a $(\gamma_f, \gamma_c)$-approximation
algorithm for the FLP if, for every instance $I = (C, F, c, f)$ of the FLP, and
for every solution $S \subseteq F$ for $I$ with facility cost $f(S) = \sum_{i \in S} f_i$ and connection cost $c(S) =
\sum_{j \in C} c(S, j)$, the cost of the solution produced by the algorithm is at most $\gamma_f f(S) + \gamma_c c(S)$.

Observe that a $(\gamma_f, \gamma_c)$-approximation algorithm for a problem is also a $\gamma$-approximation
for the problem, where $\gamma = \max(\gamma_f, \gamma_c)$. In some situations, one may take advantage of
a more discriminative approximation ratio. For instance, a $(1, \alpha)$-approximation for the
MFLP is a $2\alpha$-approximation for the metric $k$-median problem [13]. Jain et al. described
such $2\alpha$-approximations for $\alpha = 3$ [10] and $\alpha = 2$ [11]. Mahdian, Ye and Zhang [17] used
a scaling and greedy augmentation algorithm to balance a (1.11, 1.78)-approximation and
obtain a 1.52-approximation for the MFLP. Byrka and Aardal [3] combined different
bi-factor approximations to obtain a 1.5-approximation for the MFLP.

For the MFLP, Jain et al. showed that no algorithm is a $(\gamma_f, \gamma_c)$-approximation, with
$\gamma_c < 1 + 2e^{-\gamma_f}$, unless $\mathrm{NP} \subseteq \mathrm{DTIME}[n^{O(\log\log n)}]$ [11]. As the SMFLP is a generalization
of the MFLP, these negative results apply also to the SMFLP. We extend these results by
adapting the proof of Guha and Khuller [8] to the SMFLP as follows:

Theorem 3. Let $\gamma_f$ and $\gamma_c$ be positive constants with $\gamma_c < 1 + 8e^{-\gamma_f}$. If there is a $(\gamma_f, \gamma_c)$-approximation
for the SMFLP, then P = NP. In particular, let $\alpha \approx 2.04011$ be the solution
of equation $\gamma = 1 + 8e^{-\gamma}$, then there is no $\alpha'$-approximation with $\alpha' < \alpha$ for the SMFLP
unless P = NP.
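The constant in Theorem 3 is the fixed point of $\gamma = 1 + 8e^{-\gamma}$, which has no closed form but is easy to approximate. The sketch below uses plain bisection; the function names and the bracketing interval $[1, 9]$ are assumptions chosen for illustration. Replacing the coefficient 8 by 2 recovers the 1.463 threshold quoted for the MFLP.

```python
from math import exp

def gap(gamma, coefficient):
    """gamma - (1 + coefficient * e^{-gamma}); strictly increasing in gamma."""
    return gamma - (1.0 + coefficient * exp(-gamma))

def solve_threshold(coefficient, lo=1.0, hi=9.0, iterations=100):
    """Bisection on [lo, hi]; gap is negative at lo and positive at hi."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if gap(mid, coefficient) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

smflp_threshold = solve_threshold(8.0)  # squared metric case, about 2.04011
mflp_threshold = solve_threshold(2.0)   # metric case, about 1.463
```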

Proof (adapted from [8]). For simplicity, here we show that the lower bound holds unless
$\mathrm{NP} \subseteq \mathrm{DTIME}[n^{O(\log\log n)}]$. If we follow the lines of Sviridenko (see Vygen [19, Section 4.4]),
the condition is changed to unless P = NP.

Assume $A$ is a $(\gamma_f, \gamma_c)$-approximation for the SMFLP with $\gamma_c < 1 + 8e^{-\gamma_f}$. Let $\mathcal{J} = (\mathcal{U}, \mathcal{S})$
be an instance of Set Cover, with $\mathcal{U}$ being a set of elements, $\mathcal{S}$ a collection of subsets of $\mathcal{U}$
and $n = |\mathcal{U}|$. We will derive a $(d' \ln n)$-approximation algorithm for the Set Cover problem,
for some $d' < 1$.

Let $k$ be the optimal value of $\mathcal{J}$. If $k$ is not known, one can run this algorithm for
$k = 1, \ldots, n$ and output the best solution found.

The algorithm will find a solution for $\mathcal{J}$ by iteratively solving a sequence of instances of
the SMFLP of the form $\mathcal{I}^{(j)} = (C^{(j)}, F, c, f^{(j)})$, where $F = \mathcal{S}$ and the initial set $C^{(1)} = \mathcal{U}$.
For each element $x_j \in S_i$, set $c_{ij} = 1$, and for each $x_j \notin S_i$, set $c_{ij} = 9$. Note that such $c$ is
a squared metric. Let $n_j = |C^{(j)}|$. In the $j$th instance, every facility cost is $f^{(j)} = \gamma n_j / k$, for
some positive $\gamma$ to be fixed later. For each $j$, let $S^{(j)}$ denote the solution for $\mathcal{I}^{(j)}$ produced
by algorithm $A$ and let $C^{(j+1)}$ be the elements of $C^{(j)}$ not covered by any set in $S^{(j)}$. This
process stops when $C^{(j+1)} = \emptyset$ and yields the solution $S^{(1)} \cup \cdots \cup S^{(j)}$ for $\mathcal{J}$.

Observe that an optimal solution for $\mathcal{J}$ is a solution for each $\mathcal{I}^{(j)}$ with total facility cost
$k f^{(j)}$ and connection cost one for each of the $n_j$ cities. Therefore, $S^{(j)}$ has cost at most
$\gamma_f k f^{(j)} + \gamma_c n_j = (\gamma_f \gamma + \gamma_c) n_j$, because $f^{(j)} = \gamma n_j / k$. Let $\beta_j = |S^{(j)}|/k$ and $d_j$ be such that $d_j n_j$
is the number of elements covered in iteration $j$, that is, the number of elements of $C^{(j)}$ in the
union of the sets in $S^{(j)}$. Then the total facility cost of $S^{(j)}$ is $\beta_j k f^{(j)} = \beta_j \gamma n_j$. Moreover, $d_j n_j$
cities are connected with cost one and the other $n_j - d_j n_j = (1 - d_j) n_j$ cities are connected
with cost nine. Hence the total cost of $S^{(j)}$ is $\beta_j \gamma n_j + d_j n_j + 9(1 - d_j) n_j = (\beta_j \gamma + 9 - 8 d_j) n_j$.
We conclude that $\gamma_f \gamma + \gamma_c \ge \beta_j \gamma + 9 - 8 d_j$. So we have that $\gamma_c \ge (\beta_j - \gamma_f)\gamma + 9 - 8 d_j$.
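The cost accounting of one iteration can be checked on a small example. The sketch below builds the costs $1$ and $9$ from a Set Cover instance, evaluates a chosen collection of sets directly, and compares the result with the closed form $(\beta_j \gamma + 9 - 8 d_j) n_j$. The instance, the value $\gamma = 2$, and the function name `iteration_accounting` are hypothetical choices for illustration; the costs form a squared metric because their square roots are $1$ or $3$, and $3 \le 1 + 1 + 1$.

```python
from fractions import Fraction

def iteration_accounting(cities, sets, chosen, k, gamma):
    """Compare the direct cost of `chosen` with (beta_j*gamma + 9 - 8*d_j)*n_j."""
    n_j = len(cities)
    facility_cost = Fraction(gamma) * n_j / k      # f^(j) = gamma * n_j / k
    covered = {x for x in cities if any(x in sets[s] for s in chosen)}
    # Direct evaluation: a covered city pays 1, an uncovered city pays 9.
    direct = len(chosen) * facility_cost + sum(
        1 if x in covered else 9 for x in cities)
    beta_j = Fraction(len(chosen), k)              # |S^(j)| / k
    d_j = Fraction(len(covered), n_j)              # fraction of cities covered
    closed_form = (beta_j * gamma + 9 - 8 * d_j) * n_j
    return direct, closed_form, covered

# Hypothetical Set Cover instance whose optimal cover uses k = 3 sets.
cities = {1, 2, 3, 4, 5, 6}
sets = {"S1": {1, 2, 3}, "S2": {4, 5}, "S3": {5, 6}}
direct, closed_form, covered = iteration_accounting(
    cities, sets, chosen=["S1", "S2"], k=3, gamma=2)
```

Here two sets are opened at cost 4 each, five cities pay 1 and one city pays 9, so both expressions evaluate to 22.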

Let $d < 1$ be such that $1 + 8e^{-\gamma_f/d} > \gamma_c$. Suppose, for the sake of contradiction, that
$d_j < 1 - e^{-\beta_j/d}$ for some $j$. Then

$$\gamma_c > (\beta_j - \gamma_f)\gamma + 9 - 8(1 - e^{-\beta_j/d}).$$

Considering $\gamma_f$, $\gamma$ and $d$ fixed, the minimum value of the right-hand side is achieved when
$\beta_j = d \ln \frac{8}{\gamma d}$. Substituting $\beta_j$ above, we get

$$\gamma_c > \left(d \ln \frac{8}{\gamma d} - \gamma_f\right)\gamma + 1 + d\gamma.$$

Considering $d$ and $\gamma_f$ fixed, we choose the value of $\gamma$ that maximizes the right-hand side, that
is, $\gamma = \frac{8}{d} e^{-\gamma_f/d}$. Replacing in the inequality, we obtain $\gamma_c > 1 + 8e^{-\gamma_f/d} > \gamma_c$, a contradiction.

So $d_j \ge 1 - e^{-\beta_j/d}$ for every $j$, for this $d < 1$.

Following the lines of Guha and Khuller [8], one can prove that the algorithm described
above for Set Cover is a $(d' \ln n)$-approximation for some $d' < 1$. This implies that
$\mathrm{NP} \subseteq \mathrm{DTIME}[n^{O(\log\log n)}]$. □
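The two optimization steps used in the proof (minimizing over $\beta_j$, then maximizing over $\gamma$) can be sanity-checked numerically. The sketch below does so for one arbitrary choice of $\gamma_f$ and $d$; these sample values and the function names `rhs` and `after_minimizing` are assumptions for illustration only.

```python
from math import exp, log, isclose

def rhs(beta_j, gamma, gamma_f, d):
    """Right-hand side (beta_j - gamma_f)*gamma + 9 - 8*(1 - e^{-beta_j/d})."""
    return (beta_j - gamma_f) * gamma + 9 - 8 * (1 - exp(-beta_j / d))

def after_minimizing(gamma, gamma_f, d):
    """Value of rhs at the minimizing beta_j = d * ln(8 / (gamma * d))."""
    return (d * log(8 / (gamma * d)) - gamma_f) * gamma + 1 + d * gamma

gamma_f, d = 1.2, 0.9                       # arbitrary sample values
gamma_star = (8 / d) * exp(-gamma_f / d)    # maximizer chosen in the proof
beta_star = d * log(8 / (gamma_star * d))   # minimizer for this gamma
value = rhs(beta_star, gamma_star, gamma_f, d)
target = 1 + 8 * exp(-gamma_f / d)
```

For the maximizing $\gamma$, the minimizing $\beta_j$ coincides with $\gamma_f$, which is why the first term vanishes and the bound reduces to $1 + 8e^{-\gamma_f/d}$.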

3 A new factor-revealing analysis 

The metric property is used only in the analysis of the algorithms of Jain et al. [11]. In other
words, their algorithms can be applied to general FLP instances. However, the performance
guarantee is only proved to hold for the MFLP instances. In this section, we will prove that
their first algorithm, which we denote by A1, is a 2.87-approximation for the SMFLP. For the
sake of completeness, their algorithm is described next.

Algorithm A1 $(C, F, c, f)$ [11]

1. Set $U := C$, meaning that every facility starts unopened, and every city unconnected.
Each city $j$ has some budget $\alpha_j$, initially 0, and, at every moment, the budget that an
unconnected city $j$ offers to some unopened facility $i$ equals $\max(\alpha_j - c_{ij}, 0)$.

2. While $U \ne \emptyset$, the budget of each unconnected city is increased continuously until one
of the following events occurs:

(a) For some unconnected city $j$ and some open facility $i$, $\alpha_j = c_{ij}$. In this case,
connect city $j$ to facility $i$ and remove $j$ from $U$.

(b) For some unopened facility $i$, $\sum_{j \in U} \max(\alpha_j - c_{ij}, 0) = f_i$. In this case, open
facility $i$ and, for every unconnected city $j$ with $\alpha_j \ge c_{ij}$, connect $j$ to $i$ and
remove it from $U$.
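The continuous budget increase can be simulated by jumping from event to event. The following is a sketch of one possible discrete implementation, not the authors' code: since all unconnected cities share the same budget $t$, the earliest time at which an unopened facility is fully paid is the minimum over $m$ of $(f_i + s_m)/m$, where $s_m$ is the sum of the $m$ smallest costs $c_{ij}$ among unconnected cities. The function names, the tie-breaking rule, the use of exact fractions, and the toy instance are assumptions.

```python
from fractions import Fraction

def opening_time(cost_i, f_i, unconnected):
    """Smallest t with sum over unconnected j of max(t - c_ij, 0) >= f_i."""
    costs = sorted(Fraction(cost_i[j]) for j in unconnected)
    best, prefix = None, Fraction(0)
    for m, c_ij in enumerate(costs, start=1):
        prefix += c_ij
        candidate = (Fraction(f_i) + prefix) / m
        if best is None or candidate < best:
            best = candidate
    return best

def algorithm_a1(facilities, cities, f, c):
    """Event-driven simulation of steps 1 and 2; returns (opened, alpha, assigned)."""
    unconnected = set(cities)
    opened, alpha, assigned = set(), {}, {}
    t = Fraction(0)  # common budget of all unconnected cities
    while unconnected:
        events = []
        for i in facilities:
            if i in opened:
                # Event (a): the budget of city j reaches c_ij for an open facility i.
                events.extend((Fraction(c[i][j]), 0, i, j) for j in unconnected)
            else:
                # Event (b): the offers to unopened facility i add up to f_i.
                when = max(t, opening_time(c[i], f[i], unconnected))
                events.append((when, 1, i, None))
        # Ties are broken arbitrarily, with connections before openings.
        t, kind, i, j = min(events, key=lambda e: (e[0], e[1]))
        if kind == 0:
            newly_connected = [j]
        else:
            opened.add(i)
            newly_connected = [x for x in unconnected if c[i][x] <= t]
        for x in newly_connected:
            alpha[x], assigned[x] = t, i
            unconnected.remove(x)
    return opened, alpha, assigned

# Hypothetical toy instance (not taken from the paper).
facilities = ["a", "b"]
cities = [1, 2, 3]
f = {"a": 3, "b": 2}
c = {"a": {1: 1, 2: 4, 3: 6},
     "b": {1: 5, 2: 2, 3: 1}}
opened, alpha, assigned = algorithm_a1(facilities, cities, f, c)
solution_cost = (sum(f[i] for i in opened)
                 + sum(c[assigned[j]][j] for j in cities))
```

On this instance facility `b` opens at time 5/2 and connects cities 2 and 3, then facility `a` opens at time 4 and connects city 1; the budgets sum to the cost of the solution, as the dual fitting analysis requires.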
The analysis presented by Jain et al. [11] uses the dual fitting method. That is, their
algorithms produce not only a solution for the MFLP, but also a vector $\alpha = (\alpha_1, \ldots, \alpha_{|C|})$
such that the value of the solution produced is equal to $\sum_j \alpha_j$. Moreover, for the first
algorithm, following the dual fitting method, Jain et al. [11] proved that the vector $\alpha/1.861$
is a feasible solution for the dual linear program presented as (3) in [11], concluding that
the algorithm is a 1.861-approximation for the MFLP. To present a similar analysis for the
SMFLP, we use the same definitions and follow the steps of the analysis of Jain et al. We start
by adapting Lemma 3.2 from [11] for a squared metric.

Lemma 4. For every facility $i$, cities $j$ and $j'$, and vector $\alpha$ obtained by the first algorithm
of Jain et al. [11] given an instance of the SMFLP, $\sqrt{\alpha_j} \le \sqrt{\alpha_{j'}} + \sqrt{c_{ij'}} + \sqrt{c_{ij}}$.

Proof. If $\alpha_j \le \alpha_{j'}$, the inequality obviously holds. So assume $\alpha_j > \alpha_{j'}$. Let $i'$ be the facility
to which the algorithm connects city $j'$. Thus $\alpha_{j'} \ge c_{i'j'}$ and facility $i'$ is open at time
$\alpha_{j'} < \alpha_j$. If $\alpha_j > c_{i'j}$, then city $j$ would have been connected to facility $i'$ at some time
$t \le \max(\alpha_{j'}, c_{i'j}) < \alpha_j$, and $\alpha_j$ would have stopped growing then, a contradiction. Hence
$\alpha_j \le c_{i'j}$. Furthermore, by the squared metric constraint, $\sqrt{c_{i'j}} \le \sqrt{c_{i'j'}} + \sqrt{c_{ij'}} + \sqrt{c_{ij}}$.
Therefore $\sqrt{\alpha_j} \le \sqrt{\alpha_{j'}} + \sqrt{c_{ij'}} + \sqrt{c_{ij}}$. □

From Lemmas 1 and 4, we derive the following. 

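Corollary 5. For every positive $\beta$, $\gamma$, and $\delta$, and for every facility $i$, cities $j$ and $j'$, and the
vector $\alpha$ produced by the first algorithm of Jain et al. [11] given an instance of the SMFLP,
$\alpha_j \le (1 + \beta + \frac{1}{\gamma})\alpha_{j'} + (1 + \gamma + \frac{1}{\delta})c_{ij'} + (1 + \delta + \frac{1}{\beta})c_{ij}$.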

A facility $i$ is said to be $\gamma$-overtight for some positive $\gamma$ if, at the end of the algorithm,

$$\sum_{j} \max\left(\frac{\alpha_j}{\gamma} - c_{ij}, 0\right) \le f_i. \tag{1}$$

Observe that, if every facility is $\gamma$-overtight, then the vector $\alpha/\gamma$ is a feasible solution for the
dual linear program presented as (3) in [11]. Jain et al. proved that, for the MFLP, every
facility is 1.861-overtight. We want to find a $\gamma$ for the SMFLP, as close to 1 as possible, for
which every facility is $\gamma$-overtight.

Fix a facility $i$. Let us assume without loss of generality that $\alpha_j \ge \gamma c_{ij}$ only for the
first $k$ cities. Following the lines of Jain et al. [11], we want to obtain the so-called
factor-revealing program. We define a set of variables $f$, $d_j$, and $\alpha_j$, corresponding to facility cost
$f_i$, distance $c_{ij}$, and city contribution $\alpha_j$. Then, we capture the intrinsic properties of the
algorithm using constraints over these variables. We assume without loss of generality that
$\alpha_1 \le \cdots \le \alpha_k$. Also, we use Lemma 3.3 from [11], that states that the total contribution
offered to a facility at any time is at most its cost, that is, $\sum_{l=j}^{k} \max(\alpha_j - d_l, 0) \le f$. Besides
these, we have the inequalities from Lemma 4. Subject to all of these constraints, we want
to find the minimum $\gamma$ such that the facility is $\gamma$-overtight. In terms of the defined variables,
we want the maximum ratio $\sum_{j=1}^{k} \alpha_j / (f + \sum_{j=1}^{k} d_j)$. We obtain the following maximization
program:

$$
\begin{array}{lll}
z_k = \max & \dfrac{\sum_{j=1}^{k} \alpha_j}{f + \sum_{j=1}^{k} d_j} & \\[2ex]
\text{s.t.} & \alpha_j \le \alpha_{j+1} & \forall\, 1 \le j < k \\
 & \sqrt{\alpha_j} \le \sqrt{\alpha_l} + \sqrt{d_j} + \sqrt{d_l} & \forall\, 1 \le j, l \le k \\
 & \sum_{l=j}^{k} \max(\alpha_j - d_l, 0) \le f & \forall\, 1 \le j \le k \\
 & \alpha_j, d_j, f \ge 0 & \forall\, 1 \le j \le k.
\end{array} \tag{2}
$$

The next lemma has the same statement as Lemma 3.4 in [11], but it refers to program (2).
Since the proof is the same, we omit it.

Lemma 6. Let $\gamma = \sup_{k \ge 1} z_k$. Every facility is $\gamma$-overtight.

Therefore $\sup_{k \ge 1} z_k$ is an upper bound on the approximation factor of the algorithm for
the SMFLP. A slight modification of the example presented in Theorem 3.5 of [11] shows
that this upper bound is tight.



3.1 A first analysis 

Our first step is to relax (2) into a linear program. For that, we adjust the objective
function as in [11], and we approximate the squared metric property by using inequalities
given by Corollary 5. For simplicity, here we will use only the inequalities corresponding to
$\beta = \gamma = \delta = 1$. With this, we will prove that $\sup_{k \ge 1} z_k$ is not greater than 3.236. Later, we
will improve the obtained result by using a whole set of inequalities from Corollary 5, and
using a more standard factor-revealing analysis for the SMFLP. The first relaxed
factor-revealing linear program is:

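$$
\begin{array}{lll}
w_k = \max & \sum_{j=1}^{k} \alpha_j & \\
\text{s.t.} & f + \sum_{j=1}^{k} d_j \le 1 & \\
 & \alpha_j \le \alpha_{j+1} & \forall\, 1 \le j < k \\
 & \alpha_j \le 3\alpha_l + 3d_j + 3d_l & \forall\, 1 \le j, l \le k \\
 & x_{jl} \ge \alpha_j - d_l & \forall\, 1 \le j \le l \le k \\
 & \sum_{l=j}^{k} x_{jl} \le f & \forall\, 1 \le j \le k \\
 & \alpha_j, d_j, f, x_{jl} \ge 0 & \forall\, 1 \le j \le l \le k.
\end{array} \tag{3}
$$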

As (3) is a relaxation of (2), we have that $z_k \le w_k$ and thus an upper bound on $\sup_{k \ge 1} w_k$
is also an upper bound on $\sup_{k \ge 1} z_k$. Solving linear program (3) using CPLEX for $k = 540$,
we obtain the next lemma.

Lemma 7. $\sup_{k \ge 1} w_k \ge 3.220$.

To obtain an upper bound to their factor-revealing linear program, Jain et al. [11]
presented a general solution to the dual of their factor-revealing linear program. This dual
solution is deduced from computational experiments and empirical results for small values
of $k$. First, they relaxed each constraint corresponding to the contribution of the cities, that is,
$\sum_{l=j}^{k} \max(\alpha_j - d_l, 0) \le f$, obtaining $\sum_{l=j}^{l_j} (\alpha_j - d_l) \le f$, for some value $l_j$ estimated
computationally. Then, they multiplied each of these inequalities by a variable $\theta_j$ and added them
up, combining them with the other inequalities to obtain a bound on the optimal value.
Variables $\theta_j$ play the role of dual variables. They also argue that doubling (and scaling
down) a solution of their factor-revealing linear program for $k$ gives rise to a feasible solution
for $2k$ with the same value. Thus one can obtain an upper bound on the optimal value of
this linear program assuming $k$ is large enough.

We have replayed the analysis of their algorithm for the SMFLP. In their approach, they
defined $l_j$ and $\theta_j$ as 2- and 3-step functions specified by parameters $p_1$ and $p_2$. Experimentally
one can see that such choices are natural. Then, with straightforward calculations, they
obtained an upper bound as a function of $p_1$ and $p_2$, and adjusted these parameters to
prove the best possible bound on the approximation factor. Using a similar approach for
the squared metric case, that is, using step functions for $l_j$ and $\theta_j$ with a small number of
steps, we obtained a factor not better than 3.625 for the SMFLP. We managed to improve the
obtained factor to 3.512 by using a piecewise function for $\theta_j$ whose pieces are either constant
or hyperbolas in $j$. Different choices of functions $l_j$ and $\theta_j$ lead to different approximation
guarantees. We have tried several choices based on empirical observations. For instance,
observing the primal general solution may suggest that defining $l_j$ and $\theta_j$ as 3- and 4-step
functions is a good choice, but this does not improve the 3.625 bound.

Inspired by this process, we looked for an alternative analysis of program (3). Instead
of defining explicitly the value of each $\theta_j$, we multiply each inequality of this program by
variables and add them up. Then, we try to adjust the value of these variables to obtain
the best upper bound. This is done through a linear program subject to some desired
constraints, with the objective of achieving the smallest upper bound on the approximation
factor. Unfortunately, this linear program can be arbitrarily large. We deal with this
situation by choosing an appropriate value of $k$ and exploiting a special property of program (3).
The next lemma shows that $w_k$ does not decrease for multiples of $k$.

Lemma 8. For every $k$ and every $t$, $w_k \le w_{kt}$.

Proof. It is enough to make $t$ replicas of an optimal solution for $k$, and then scale down the
variables by $1/t$, to obtain a feasible solution of the linear program for $tk$ with objective
value $w_k$. □
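The replication argument can be illustrated on a small feasible point of program (3). The sketch below replicates each city $t$ times, scales $\alpha_j$ and $d_j$ by $1/t$, and checks feasibility with $x_{jl} = \max(\alpha_j - d_l, 0)$. It keeps $f$ unchanged, which is one reading of the scaling step under which the constraint $f + \sum_j d_j \le 1$ is preserved; the sample point and the function names are hypothetical.

```python
from fractions import Fraction

def is_feasible(alpha, d, f):
    """Check the constraints of program (3), taking x_jl = max(alpha_j - d_l, 0)."""
    k = len(alpha)
    if f < 0 or any(v < 0 for v in alpha + d):
        return False
    if f + sum(d) > 1:
        return False
    if any(alpha[j] > alpha[j + 1] for j in range(k - 1)):
        return False
    for j in range(k):
        for l in range(k):
            if alpha[j] > 3 * alpha[l] + 3 * d[j] + 3 * d[l]:
                return False
        if sum(max(alpha[j] - d[l], 0) for l in range(j, k)) > f:
            return False
    return True

def replicate(alpha, d, f, t):
    """t consecutive copies of each city, with alpha and d scaled by 1/t; f is kept."""
    big_alpha = [a / t for a in alpha for _ in range(t)]
    big_d = [x / t for x in d for _ in range(t)]
    return big_alpha, big_d, f

# Hypothetical feasible point for k = 2 (not an optimal solution).
alpha = [Fraction(1, 2), Fraction(9, 10)]
d = [Fraction(1, 10), Fraction(1, 10)]
f = Fraction(4, 5)

big_alpha, big_d, big_f = replicate(alpha, d, f, t=3)
```

The replicated point has six cities, remains feasible, and has the same objective value $\sum_j \alpha_j = 7/5$ as the original point.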

In the next lemma, we use a linear program to give a very tight bound on $\sup_{k \ge 1} w_k$.

Lemma 9. For every $k$, $w_k \le 3.236$.

Proof. In what follows, we deduce an upper bound on $w_k$ by deriving a linear minimization
program whose feasible solutions are upper bounds on $w_k$. Then we present a feasible solution
of value less than 3.236 for this program. The idea is to determine a conical combination
of the inequalities of (3) that imply inequality (1) for a $\gamma$ as small as possible. The linear
minimization program will help us to choose the coefficients of such conical combination.

Let us rewrite the third inequality of program (3), so that the right-hand side is zero.
For each $j$ and $l$, we multiply the corresponding inequality by $\varphi_{jl}$. Denote by $A$ the sum of
all these inequalities, that is,

$$\sum_{j=1}^{k} \sum_{l=1}^{k} \varphi_{jl} (\alpha_j - 3\alpha_l - 3d_l - 3d_j) \le 0.$$
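The fourth and fifth inequalities of program (3) can be relaxed to the set of inequalities
$\sum_{l=j}^{l_j} (\alpha_j - d_l) \le f$, one for each $l_j$ such that $j \le l_j \le k$. For each $j$ and $l_j$, we multiply the
corresponding inequality by $\theta_{j l_j}$ and denote by $B$ the inequality resulting from summing them
up, that is,

$$\sum_{j=1}^{k} \sum_{l=j}^{k} \theta_{jl} \sum_{i=j}^{l} (\alpha_j - d_i) \le \sum_{j=1}^{k} \sum_{l=j}^{k} \theta_{jl}\, f.$$

The coefficients of $\alpha_j$ in $A$ and $B$ are, respectively,

$$\mathrm{coeff}_A[\alpha_j] = \sum_{l=1}^{k} (\varphi_{jl} - 3\varphi_{lj}) \quad\text{and}\quad \mathrm{coeff}_B[\alpha_j] = \sum_{l=j}^{k} (l - j + 1)\,\theta_{jl},$$

and the coefficients of $-d_j$ in $A$ and $B$ are, respectively,

$$\mathrm{coeff}_A[-d_j] = \sum_{l=1}^{k} 3(\varphi_{jl} + \varphi_{lj}) \quad\text{and}\quad \mathrm{coeff}_B[-d_j] = \sum_{i=1}^{j} \sum_{l=j}^{k} \theta_{il}.$$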

Now, we sum inequalities $A$ and $B$ and obtain a new inequality $C$:

$$\sum_{j=1}^{k} \mathrm{coeff}_C[\alpha_j]\, \alpha_j - \sum_{j=1}^{k} \mathrm{coeff}_C[-d_j]\, d_j \le \mathrm{coeff}_C[f]\, f. \tag{4}$$

We want to find values for $\gamma$, $\theta_{jl}$, and $\varphi_{jl}$ so that the corresponding coefficients of $C$ are
such that inequality (4) implies, for sufficiently large $k$, that

$$\sum_{j=1}^{k} \alpha_j - \gamma \sum_{j=1}^{k} d_j \le \gamma f. \tag{5}$$

Moreover, we want $\gamma$ as small as possible. To obtain inequality (5) from inequality (4), it
is enough that, for each $j$, $\mathrm{coeff}_C[\alpha_j] \ge 1$, $\mathrm{coeff}_C[-d_j] \le \gamma$, and $\mathrm{coeff}_C[f] \le \gamma$.
Hence, this can be expressed by the following linear program.

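$$
\begin{array}{lll}
y_k = \min & \gamma & \\
\text{s.t.} & \mathrm{coeff}_C[\alpha_j] \ge 1 & \forall\, 1 \le j \le k \\
 & \mathrm{coeff}_C[-d_j] \le \gamma & \forall\, 1 \le j \le k \\
 & \mathrm{coeff}_C[f] \le \gamma & \\
 & \varphi_{jl} \ge 0 & \forall\, 1 \le j, l \le k \\
 & \theta_{jl} \ge 0 & \forall\, 1 \le j \le l \le k.
\end{array} \tag{6}
$$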

The interested reader may observe that this program is the dual of a relaxed version of 
the factor-revealing linear program (3). Therefore, its optimal value is an upper bound on 
the optimal value of (3). 

Using Lemma 8, we may assume that k has the form k = pt with p and t positive integers. 
We will use a scaling argument to create a linear minimization program with a small number 
of variables, and obtain a feasible solution for program (6) from a solution of the former 
program. Then, we will show that the value of the generated solution is bounded by the 
value of the small solution. 
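Consider variables $\gamma' \in \mathbb{R}_+$, $\varphi'_{jl} \in \mathbb{R}_+$ for $1 \le j, l \le t$, and $\theta'_{jl} \in \mathbb{R}_+$ for $1 \le j \le l \le t$.
For an arbitrary $n$, let $\hat{n} = \lceil n/p \rceil$. We obtain a candidate solution for program (6) by taking

$$\varphi_{jl} = \frac{\varphi'_{\hat{j}\hat{l}}}{p}, \quad \theta_{jl} = \frac{\theta'_{\hat{j}\hat{l}}}{p^2}, \quad\text{and}\quad \gamma = \gamma'. \tag{7}$$

Let us calculate each coefficient of $C$ for this solution.

$$
\begin{aligned}
\mathrm{coeff}_C[\alpha_j] &= \sum_{l=1}^{k} (\varphi_{jl} - 3\varphi_{lj}) + \sum_{l=j}^{k} (l - j + 1)\,\theta_{jl} \\
&= \sum_{l=1}^{k} \left(\frac{\varphi'_{\hat{j}\hat{l}}}{p} - 3\frac{\varphi'_{\hat{l}\hat{j}}}{p}\right) + \sum_{l=j}^{k} (l - j + 1)\frac{\theta'_{\hat{j}\hat{l}}}{p^2} \\
&\ge \sum_{l'=1}^{t} p\left(\frac{\varphi'_{\hat{j}l'}}{p} - 3\frac{\varphi'_{l'\hat{j}}}{p}\right) + \sum_{l'=\hat{j}+1}^{t} p\,(pl' - p - p\hat{j})\frac{\theta'_{\hat{j}l'}}{p^2} \\
&= \sum_{l'=1}^{t} (\varphi'_{\hat{j}l'} - 3\varphi'_{l'\hat{j}}) + \sum_{l'=\hat{j}+1}^{t} (l' - \hat{j} - 1)\,\theta'_{\hat{j}l'},
\end{aligned}
$$

$$
\begin{aligned}
\mathrm{coeff}_C[-d_j] &= \sum_{l=1}^{k} 3(\varphi_{jl} + \varphi_{lj}) + \sum_{i=1}^{j} \sum_{l=j}^{k} \theta_{il} \\
&\le \sum_{l'=1}^{t} p \cdot 3\left(\frac{\varphi'_{\hat{j}l'}}{p} + \frac{\varphi'_{l'\hat{j}}}{p}\right) + \sum_{i'=1}^{\hat{j}} p \sum_{l'=\hat{j}}^{t} p\,\frac{\theta'_{i'l'}}{p^2} \\
&= \sum_{l'=1}^{t} 3(\varphi'_{\hat{j}l'} + \varphi'_{l'\hat{j}}) + \sum_{i'=1}^{\hat{j}} \sum_{l'=\hat{j}}^{t} \theta'_{i'l'},
\end{aligned}
$$

$$
\mathrm{coeff}_C[f] = \sum_{j=1}^{k} \sum_{l=j}^{k} \theta_{jl} = \sum_{j=1}^{pt} \sum_{l=j}^{pt} \frac{\theta'_{\hat{j}\hat{l}}}{p^2} \le \sum_{j'=1}^{t} p \sum_{l'=j'}^{t} p\,\frac{\theta'_{j'l'}}{p^2} = \sum_{j'=1}^{t} \sum_{l'=j'}^{t} \theta'_{j'l'}.
$$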
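Now, we want to find the minimum value of $\gamma'$ and values for $\varphi'_{jl}$ and $\theta'_{jl}$ such that the
candidate solution for program (6) is feasible. We may define the following linear program.

$$
\begin{array}{lll}
x_t = \min & \gamma' & \\
\text{s.t.} & \sum_{l=1}^{t} (\varphi'_{jl} - 3\varphi'_{lj}) + \sum_{l=j+1}^{t} (l - j - 1)\,\theta'_{jl} \ge 1 & \forall\, 1 \le j \le t \\
 & \sum_{l=1}^{t} 3(\varphi'_{jl} + \varphi'_{lj}) + \sum_{i=1}^{j} \sum_{l=j}^{t} \theta'_{il} \le \gamma' & \forall\, 1 \le j \le t \\
 & \sum_{j=1}^{t} \sum_{l=j}^{t} \theta'_{jl} \le \gamma' & \\
 & \varphi'_{jl} \ge 0 & \forall\, 1 \le j, l \le t \\
 & \theta'_{jl} \ge 0 & \forall\, 1 \le j \le l \le t.
\end{array} \tag{8}
$$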

Consider an optimal solution for program (6). Replacing it in (4), that is, in inequality
$C$, we obtain $\sum_{j=1}^{k} \alpha_j - \gamma \sum_{j=1}^{k} d_j \le \gamma f$. Thus, $w_k \le \gamma = y_k$. Now, consider an optimal
solution for program (8) and the corresponding generated solution for program (6). We
obtain $y_k \le \gamma = \gamma' = x_t$ and conclude that $w_k \le x_t$.

Using CPLEX to solve program (8), we obtained $x_t \approx 3.23586 < 3.236$ for the value of $t$
used in the computation, and this concludes the proof. □



3.2 An improved factor-revealing analysis 

In Lemma 9, we obtained the minimization program (8) from a conical combination of con- 
straints from program (3). The optimal value of this minimization program is an upper 
bound on the approximation factor. The calculations involved are very similar to those used 
to obtain the corresponding dual program. We propose a new standard factor-revealing tech- 
nique, which provides a more straightforward way to obtain a bound on the approximation 
factor. 

Consider the traditional maximization factor-revealing linear program. First, the dual 
program is obtained for some k. Take k in the form k = pt, for a fixed t. We will create a 
minimization program that mimics the constraints of the dual, but depends only on t and 
bounds its optimal value for every k. The idea is to constrain the variables of the small 
program to obtain a feasible solution for the dual program. To obtain a linear program 
independent of k, we will scale the variables by p. Since any solution of such a program 
reveals an upper bound on the approximation factor, we call it an upper bound factor- 
revealing program. 

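In order to derive a better upper bound factor-revealing linear program, we will use
a whole set of linear inequalities to approximate the nonlinear constraint in (2). Consider
tuples $(\beta_i, \gamma_i, \delta_i)$ of positive real numbers and $B_i = 1 + \beta_i + \frac{1}{\gamma_i}$, $C_i = 1 + \gamma_i + \frac{1}{\delta_i}$, $D_i = 1 + \delta_i + \frac{1}{\beta_i}$
for $1 \le i \le m$. Using Corollary 5, we insert inequalities corresponding to the given tuples,
replacing the nonlinear constraint, and obtain $z_k \le w_k$, where $w_k$ is given by

$$
\begin{array}{lll}
w_k = \max & \sum_{j=1}^{k} \alpha_j & \\
\text{s.t.} & f + \sum_{j=1}^{k} d_j \le 1 & \\
 & \alpha_j \le \alpha_{j+1} & \forall\, 1 \le j < k \\
 & \alpha_j \le B_i \alpha_l + C_i d_j + D_i d_l & \forall\, 1 \le j, l \le k,\ 1 \le i \le m \\
 & x_{jl} \ge \alpha_j - d_l & \forall\, 1 \le j \le l \le k \\
 & \sum_{l=j}^{k} x_{jl} \le f & \forall\, 1 \le j \le k \\
 & \alpha_j, d_j, f \ge 0 & \forall\, 1 \le j \le k.
\end{array} \tag{9}
$$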
The following lemma gives a lower bound on the approximation factor of the algorithm
for the SMFLP.

Lemma 10. $\sup_{k \ge 1} z_k \ge 2.86$.

Proof. Although program (2) contains nonlinear constraints, we may use linear program
packages to solve it. We start by solving program (9) with a fixed number of inequalities.
Then, we employ a cutting plane insertion strategy: if the obtained solution violates the
squared metric property, we derive a cutting plane using Lemma 1, and re-solve the linear
program with this additional constraint. Using CPLEX with the cutting plane strategy, we
obtained $z_k \approx 2.86099 > 2.86$ for a fixed $k$. □

Now, we can bound the approximation factor of the algorithm using an upper bound 
factor-revealing program. 

Lemma 11. $\sup_{k \ge 1} z_k \le 2.87$.

Proof. It is easy to see that the proof of Lemma 8 is also valid for program (9). So, from
Lemma 1, we have that $z_k \le w_{kt}$, for every positive integer $t$. We assume that $k$ has the
form $k = pt$, with $p$ and $t$ positive integers. The dual of the linear program (9) is

$$
\begin{array}{lll}
w_k = \min & \gamma & \\
\text{s.t.} & a_j - a_{j-1} + \sum_{i=1}^{m} \sum_{l=1}^{k} c_{jli} - \sum_{i=1}^{m} B_i \sum_{l=1}^{k} c_{lji} + \sum_{l=j}^{k} e_{jl} \ge 1 & \forall\, 1 \le j \le k \\
 & \sum_{i=1}^{m} C_i \sum_{l=1}^{k} c_{jli} + \sum_{i=1}^{m} D_i \sum_{l=1}^{k} c_{lji} + \sum_{l=1}^{j} e_{lj} \le \gamma & \forall\, 1 \le j \le k \\
 & \sum_{j=1}^{k} h_j \le \gamma & \\
 & e_{jl} \le h_j & \forall\, 1 \le j \le l \le k \\
 & a_0 = a_k = 0,\ a_j, h_j, e_{jl}, c_{jli} \ge 0 & \forall\, 1 \le j, l \le k,\ 1 \le i \le m.
\end{array} \tag{10}
$$

We can derive the upper bound factor-revealing linear program. We would like to define
variables as in equation (7). Just using a scale factor is not sufficient to preserve the
variables $a_j$ in program (10). The variables $a_j$ correspond to the ordering restrictions of primal
variables $\alpha_j$ in program (9), and computational experiments have indicated that removing
such restrictions does not change the optimal value significantly, for large values of $k$. So, we
could just set $a_j = 0$ for all $j$. However, we want to preserve such restrictions, as they will
shortly be needed to prove Lemma 13. To do this, we can interpolate the variables of the
upper bound factor-revealing program to obtain the variables of the lower bound program.
Again, we group sets of variables based on their indices. For that, we denote the group of a
variable of index $n$ as $\hat{n}$. Let $\hat{n} = \lceil n/p \rceil$ and consider variables $\gamma'$, $a'_j$, $c'_{jli}$, $e'_{jl}$, $h'_j$. We obtain a
candidate solution for program (10) by defining

$$\gamma = \gamma', \quad a_j = p\,a'_{\hat{j}} - (p\hat{j} - j)(a'_{\hat{j}} - a'_{\hat{j}-1}), \quad c_{jli} = \frac{c'_{\hat{j}\hat{l}i}}{p}, \quad e_{jl} = \frac{e'_{\hat{j}\hat{l}}}{p}, \quad\text{and}\quad h_j = \frac{h'_{\hat{j}}}{p}. \tag{11}$$

In the following, we will use definition (11) to obtain a candidate solution for program (10)
from a small set of primed variables. Then, for each constraint of program (10), we obtain the
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expression formed by the dependent terms, and calculate it as a function of the considered 
variables. Notice that there is an expression for each primal variable of program (9). These 
expressions are analogous to the primal variables coefficients used in Lemma 9, thus, for each 
primal variable x, we say that this is the coefficient expression for x, and we will denote it 
by coeff[a;]. 

Now we create the minimization upper bound factor-revealing program. The objective 
value is obtained by applying definition (11) to the objective value of program (10). Then, for 
each group of coefficient expressions that has the same value, we include a constraint in the 
upper bound program that bounds the expression by the independent term. Notice that each 
upper bound factor-revealing linear program constraint may correspond to an arbitrarily 
large number of constraints of the factor-revealing linear program. In the following, we 
calculate and bound each coefficient expression. 

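First notice that $a_j - a_{j-1} = a'_{\hat{j}} - a'_{\hat{j}-1}$. To see this, it is enough to use definition (11)
and consider the cases $\hat{j} = \widehat{(j-1)}$, and $\hat{j} = \widehat{(j-1)} + 1$. Now we have:

\[
\begin{aligned}
\mathrm{coeff}[\alpha_j]
&= a_j - a_{j-1} + \sum_{i=1}^{m}\sum_{l=1}^{k} c_{jli} - \sum_{i=1}^{m} B_i \sum_{l=1}^{k} c_{lji} + \sum_{l=j}^{k} e_{jl} \\
&= a'_{\hat{j}} - a'_{\hat{j}-1} + \sum_{i=1}^{m}\sum_{l=1}^{pt} \frac{c'_{\hat{j}\hat{l}i}}{p} - \sum_{i=1}^{m} B_i \sum_{l=1}^{pt} \frac{c'_{\hat{l}\hat{j}i}}{p} + \sum_{l=j}^{pt} \frac{e'_{\hat{j}\hat{l}}}{p} \\
&\ge a'_{\hat{j}} - a'_{\hat{j}-1} + \sum_{i=1}^{m}\sum_{l'=1}^{t} p\,\frac{c'_{\hat{j}l'i}}{p} - \sum_{i=1}^{m} B_i \sum_{l'=1}^{t} p\,\frac{c'_{l'\hat{j}i}}{p} + \sum_{l'=\hat{j}+1}^{t} p\,\frac{e'_{\hat{j}l'}}{p} \\
&= a'_{\hat{j}} - a'_{\hat{j}-1} + \sum_{i=1}^{m}\sum_{l'=1}^{t} c'_{\hat{j}l'i} - \sum_{i=1}^{m} B_i \sum_{l'=1}^{t} c'_{l'\hat{j}i} + \sum_{l'=\hat{j}+1}^{t} e'_{\hat{j}l'} \ \ge\ 1.
\end{aligned}
\]

\[
\begin{aligned}
\mathrm{coeff}[d_j]
&= \gamma - \sum_{i=1}^{m} C_i \sum_{l=1}^{k} c_{jli} - \sum_{i=1}^{m} D_i \sum_{l=1}^{k} c_{lji} - \sum_{l=1}^{j} e_{lj} \\
&= \gamma' - \sum_{i=1}^{m} C_i \sum_{l=1}^{pt} \frac{c'_{\hat{j}\hat{l}i}}{p} - \sum_{i=1}^{m} D_i \sum_{l=1}^{pt} \frac{c'_{\hat{l}\hat{j}i}}{p} - \sum_{l=1}^{j} \frac{e'_{\hat{l}\hat{j}}}{p} \\
&\ge \gamma' - \sum_{i=1}^{m} C_i \sum_{l'=1}^{t} p\,\frac{c'_{\hat{j}l'i}}{p} - \sum_{i=1}^{m} D_i \sum_{l'=1}^{t} p\,\frac{c'_{l'\hat{j}i}}{p} - \sum_{l'=1}^{\hat{j}} p\,\frac{e'_{l'\hat{j}}}{p} \\
&= \gamma' - \sum_{i=1}^{m} C_i \sum_{l'=1}^{t} c'_{\hat{j}l'i} - \sum_{i=1}^{m} D_i \sum_{l'=1}^{t} c'_{l'\hat{j}i} - \sum_{l'=1}^{\hat{j}} e'_{l'\hat{j}} \ \ge\ 0.
\end{aligned}
\]

\[
\mathrm{coeff}[f] = \gamma - \sum_{j=1}^{k} h_j = \gamma' - \sum_{j=1}^{pt} \frac{h'_{\hat{j}}}{p} = \gamma' - \sum_{j'=1}^{t} p\,\frac{h'_{j'}}{p} = \gamma' - \sum_{j'=1}^{t} h'_{j'} \ \ge\ 0.
\]

\[
\mathrm{coeff}[x_{jl}] = h_j - e_{jl} = \frac{h'_{\hat{j}}}{p} - \frac{e'_{\hat{j}\hat{l}}}{p} \ \ge\ 0.
\]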
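
The grouping of indices and the interpolation of the ordering variables can be checked numerically. The following Python sketch builds $a_0, \dots, a_{pt}$ from prime values $a'_0, \dots, a'_t$ with the interpolation rule of definition (11) and verifies the identity $a_j - a_{j-1} = a'_{\hat{j}} - a'_{\hat{j}-1}$ with exact rational arithmetic. The function names and the sample values are illustrative assumptions, and $a_0$ is set to $p\,a'_0$ by convention.

```python
from fractions import Fraction

def group(n, p):
    """Group index ceil(n / p) of a positive index n, with groups of size p."""
    return -(-n // p)

def interpolate_a(a_prime, p):
    """Build a_0..a_{pt} from a'_0..a'_t by the interpolation rule
    a_j = p*a'_g - (p*g - j)*(a'_g - a'_{g-1}), where g = group(j, p)."""
    t = len(a_prime) - 1
    a = [p * a_prime[0]]  # convention for index 0 (assumption)
    for j in range(1, p * t + 1):
        g = group(j, p)
        a.append(p * a_prime[g] - (p * g - j) * (a_prime[g] - a_prime[g - 1]))
    return a

if __name__ == "__main__":
    p = 3
    a_prime = [Fraction(v) for v in (0, 3, 1, 4, 0)]  # sample values with a'_0 = a'_t = 0
    a = interpolate_a(a_prime, p)
    for j in range(1, len(a)):
        g = group(j, p)
        # consecutive differences only depend on the group of j
        assert a[j] - a[j - 1] == a_prime[g] - a_prime[g - 1]
    print([str(v) for v in a])
```

Within a group the values change linearly, which is why every index of the same group yields the same difference.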
We notice that for each primal variable, the constraint for its coefficient expression is
equivalent to the constraint of any other primal variable in the same group. For example, for
any pair $\alpha_j$ and $\alpha_l$, such that $\hat{j} = \hat{l}$, we need to add only one constraint to the upper bound
factor-revealing program; therefore, we need only $t$ constraints for this kind of primal variable.
We remark that the constraint obtained for $\mathrm{coeff}[x_{jl}]$ does not depend on $p$. Conjoining all
different constraints, and fixing variables such as $a'_0$ and $a'_t$ to zero, we obtain program (12).

\[
\begin{array}{rll}
x_t^{A1} = \min & \gamma \\
\text{s.t.} & a_j - a_{j-1} + \sum_{i=1}^{m}\sum_{l=1}^{t} c_{jli} - \sum_{i=1}^{m} B_i \sum_{l=1}^{t} c_{lji} + \sum_{l=j+1}^{t} e_{jl} \ge 1 & \forall\, 1 \le j \le t \\
& \sum_{i=1}^{m} C_i \sum_{l=1}^{t} c_{jli} + \sum_{i=1}^{m} D_i \sum_{l=1}^{t} c_{lji} + \sum_{l=1}^{j} e_{lj} \le \gamma & \forall\, 1 \le j \le t \\
& \sum_{j=1}^{t} h_j \le \gamma \\
& e_{jl} \le h_j & \forall\, 1 \le j \le l \le t \\
& a_0 = a_t = 0,\ a_j, h_j, e_{jl}, c_{jli} \ge 0 & \forall\, 1 \le j, l \le t,\ 1 \le i \le m.
\end{array}
\tag{12}
\]

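Now, we want to use Lemma 1 and choose a set of tuples $(\beta, \gamma, \delta)$, so that the squared
metric is minimally relaxed. To accommodate the premises of Lemma 1, we solve the dual
of the upper bound factor-revealing LP, so we may use the same cutting plane strategy used
in Lemma 10. The dual is given in the following.

\[
\begin{array}{rll}
x_t^{A1} = \max & \sum_{j=1}^{t} \alpha_j \\
\text{s.t.} & f + \sum_{j=1}^{t} d_j \le 1 \\
& \alpha_j \le \alpha_{j+1} & \forall\, 1 \le j < t \\
& \alpha_j \le B_i \alpha_l + C_i d_j + D_i d_l & \forall\, 1 \le j, l \le t,\ 1 \le i \le m \\
& x_{jl} \ge \alpha_j - d_l & \forall\, 1 \le j < l \le t \\
& \sum_{l=j}^{t} x_{jl} \le f & \forall\, 1 \le j \le t \\
& \alpha_j, d_j, f, x_{jl} \ge 0 & \forall\, 1 \le j, l \le t.
\end{array}
\tag{13}
\]

Using the cutting plane strategy with CPLEX we obtain $x_{700}^{A1} \approx 2.8697 < 2.87$. □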

If we apply this analysis for the metric case, we obtain an upper bound factor-revealing 
program similar to program (13). The only difference is that, for the metric case, there are no 
coefficients Bi, Ci and Di. We use this modified linear program to tighten the approximation 
factor for the metric case. 

Lemma 12. For the MFLP, the approximation factor of A1 [11] is between 1.814 and 1.816.

Proof. Let $z_k^{A1}$ be the optimal value of the lower bound factor-revealing program (5) in [11].
The corresponding upper bound factor-revealing program is:

\[
\begin{array}{rll}
x_t^{A1} = \max & \sum_{j=1}^{t} \alpha_j \\
\text{s.t.} & f + \sum_{j=1}^{t} d_j \le 1 \\
& \alpha_j \le \alpha_{j+1} & \forall\, 1 \le j < t \\
& \alpha_j \le \alpha_l + d_j + d_l & \forall\, 1 \le j, l \le t \\
& x_{jl} \ge \alpha_j - d_l & \forall\, 1 \le j < l \le t \\
& \sum_{l=j}^{t} x_{jl} \le f & \forall\, 1 \le j \le t \\
& \alpha_j, d_j, f, x_{jl} \ge 0 & \forall\, 1 \le j, l \le t.
\end{array}
\tag{14}
\]

Numerical computations using CPLEX show that $z_{1000}^{A1} \approx 1.81412 > 1.814$, and that
$x_{1000}^{A1} \approx 1.81584 < 1.816$. □



We notice that the only difference between the upper and lower bound factor-revealing
programs is that the upper bound factor-revealing program does not contain the restrictions
$\alpha_j - d_j \le x_{jj}$ for all $j$. We exploit the similarity between these programs to bound the gap
between their optimal values. The following lemma is valid for both the metric and squared
metric cases.

Lemma 13. Let $z_k^{A1}$ be the optimal value of the lower bound factor-revealing program (9)
(program (5) in [11]) and let $(\alpha, d, x, f)$ be an optimal solution for program (13) (respectively
program (14)) with cost value $x_k^{A1}$. If $\varepsilon = \max_j\{\alpha_j - d_j\}$, then $z_k^{A1} \ge \frac{1}{1+\varepsilon}\, x_k^{A1}$.

Proof. Consider a candidate solution $(\alpha, d, x', f')$ for the lower bound factor-revealing program,
with identical $\alpha$ and $d$, and such that $f' = f + \varepsilon$, $x'_{jl} = x_{jl}$ if $j \ne l$, and $x'_{jj} = \alpha_j - d_j$. Clearly, this
solution has an objective value of $x_k^{A1}$ and it violates only the first restriction of program (9)
(program (5) in [11], respectively). For that, we get $f' + \sum_{j=1}^{k} d_j \le 1 + \varepsilon$. Now, it is
enough to multiply each variable by $\frac{1}{1+\varepsilon}$ and obtain a feasible solution. □
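
The construction in the proof of Lemma 13 can be sketched in Python for the metric case. The code below lifts a solution of the upper bound program (14) to a feasible solution of the lower bound program, and includes a small feasibility checker. The data layout (0-based lists, a dictionary for $x$) and the function names are assumptions made for this illustration; the sketch also clamps $x'_{jj}$ and $\varepsilon$ at zero so that all variables remain nonnegative.

```python
def lift_to_lower_bound(alpha, d, x, f):
    """Turn a solution of the upper bound program into one of the lower
    bound program: set x'_jj = max(alpha_j - d_j, 0), f' = f + eps, and
    scale every variable by 1 / (1 + eps)."""
    k = len(alpha)
    eps = max(0.0, max(alpha[j] - d[j] for j in range(k)))
    x_new = dict(x)
    for j in range(k):
        x_new[(j, j)] = max(alpha[j] - d[j], 0.0)
    s = 1.0 / (1.0 + eps)
    return ([a * s for a in alpha], [v * s for v in d],
            {key: v * s for key, v in x_new.items()}, (f + eps) * s, eps)

def feasible_lower_metric(alpha, d, x, f, tol=1e-9):
    """Check the constraints of the metric lower bound program
    (program (14) plus the restrictions x_jj >= alpha_j - d_j)."""
    k = len(alpha)
    ok = f + sum(d) <= 1 + tol
    ok &= all(alpha[j] <= alpha[j + 1] + tol for j in range(k - 1))
    ok &= all(alpha[j] <= alpha[l] + d[j] + d[l] + tol
              for j in range(k) for l in range(k))
    ok &= all(x[(j, l)] >= alpha[j] - d[l] - tol
              for j in range(k) for l in range(j, k))
    ok &= all(sum(x[(j, l)] for l in range(j, k)) <= f + tol for j in range(k))
    ok &= all(v >= -tol for v in x.values())
    return ok

if __name__ == "__main__":
    # A hand-made feasible point of the upper bound program with k = 2.
    alpha, d, f = [0.5, 0.5], [0.2, 0.2], 0.6
    x = {(0, 0): 0.0, (0, 1): 0.3, (1, 1): 0.0}
    a2, d2, x2, f2, eps = lift_to_lower_bound(alpha, d, x, f)
    print(feasible_lower_metric(alpha, d, x, f))  # the diagonal restriction fails
    print(feasible_lower_metric(a2, d2, x2, f2))  # feasible after lifting
    print(sum(alpha), sum(a2), eps)               # objective shrinks by 1 / (1 + eps)
```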

From the last lemma, it is clear that the upper and lower bound factor-revealing programs 
yield close values, except for an error factor that depends only on the variable values of an 
optimal solution for the upper bound factor-revealing program. Since the optimal value 
decreases as the number of variables k increases, it is reasonable to expect that the value of 
both factor-revealing programs get very close as k tends to infinity. Indeed, for the metric 
case, it is easy to show that this error vanishes as k goes to infinity and, therefore, the upper 
bound and the lower bound factor-revealing programs converge to the same value, as k goes 
to infinity. 

Theorem 14. Let $z_k^{A1}$ be as in program (5) in [11] and let $x_k^{A1}$ be as in program (14). Then
$\sup_{k \ge 1} z_k^{A1} = \inf_{k \ge 1} x_k^{A1}$.

Proof. First notice that, if we double a dual solution of program (14), then the obtained
solution for the corresponding minimization upper bound factor-revealing program is still feasible.
Therefore, we may assume that $k$ is arbitrarily large. Consider an optimal solution of
program (14). We have that $\alpha_j - d_j \le \alpha_l + d_l$, for every $j$ and $l$. Let $j$ be such that
$\varepsilon = \alpha_j - d_j$ is maximum and add up these inequalities for all $l$. We get
$k\varepsilon = k(\alpha_j - d_j) = \sum_{l=1}^{k} (\alpha_j - d_j) \le \sum_{l=1}^{k} (\alpha_l + d_l) \le x_k^{A1} + 1 \le 1.816 + 1$.
From Lemmas 12 and 13, we get that
$x_k^{A1} \ge z_k^{A1} \ge \frac{1}{1+\varepsilon}\, x_k^{A1} \ge \frac{1}{1 + 2.816/k}\, x_k^{A1}$.
Taking the limit as $k$ goes to infinity, we get that $\sup_{k \ge 1} z_k^{A1} = \inf_{k \ge 1} x_k^{A1}$. □
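
As a worked instance of the bound used in this proof, the following Python sketch computes the guaranteed lower bound $x_k/(1+\varepsilon)$ with $\varepsilon \le (x_k + 1)/k$, using the value $x_{1000}^{A1} \approx 1.81584$ reported in Lemma 12. The function name is an assumption made for this illustration, and the bound applies to the metric case only.

```python
def lower_bound_from_upper(x_k, k):
    """Lower bound on z_k implied by Lemma 13 and k*eps <= x_k + 1 (metric case)."""
    eps = (x_k + 1.0) / k
    return x_k / (1.0 + eps)

if __name__ == "__main__":
    for k in (10, 100, 1000, 10000):
        # using the k = 1000 value for every k only illustrates how the gap shrinks
        print(k, lower_bound_from_upper(1.81584, k))
```

For $k = 1000$ the guaranteed value is about 1.8107, which is consistent with the computed $z_{1000}^{A1} \approx 1.81412$.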
It would be nice to bound the values of the variables of program (13), as this would suffice
to show that the factor-revealing programs also converge for the squared metric case. Since
the coefficients of the squared triangle inequality involved in program (13) are all greater
than one, we cannot use the same approach as in Theorem 14. Although experiments suggest
that the value of $\varepsilon$ in an optimal solution decreases as $k$ increases, it does not seem
trivial to determine whether $\varepsilon$ vanishes when $k$ goes to infinity.

4 Analysis of the second algorithm 

In this section we analyze the second algorithm of Jain et al. [11] for the squared metric
case. The algorithm is essentially the same as Algorithm A1, but each connected city keeps
contributing to unopened facilities. The contribution of a connected city j to an unopened
facility i is the budget the city would save if facility i were opened. The algorithm, denoted
by A2, is described in the following.

Algorithm A2 (C, F, c, f) [11]

1. Set U := C, meaning that every facility starts unopened, and every city unconnected.
Each city j has some budget $\alpha_j$, initially 0. At every moment, for each unopened
facility i, if city j is unconnected, then j offers $\max(\alpha_j - c_{ij}, 0)$ to i, and, if city j is
connected to facility i', then j offers $\max(c_{i'j} - c_{ij}, 0)$ to i.

2. While $U \ne \emptyset$, the budget of each unconnected city is increased continuously until one
of the following events occurs:

(a) For some unconnected city j and some open facility i, $\alpha_j = c_{ij}$. In this case,
connect city j to facility i and remove j from U.

(b) For some unopened facility i, the total offer i receives from the cities equals the 
cost fi of opening i. In this case, open facility i, connect to i each city j with a 
positive offer to i, and remove each connected city from U. 
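
The two events of Algorithm A2 can be simulated with a simple event-driven loop. The Python sketch below is one possible way to do this, not the implementation of [11]: it assumes that all unconnected cities share the same budget (the current time), breaks ties by facility index, and uses our own function and variable names.

```python
import math

def algorithm_a2(cost, f):
    """Sketch of Algorithm A2. cost[i][j] is the connection cost between
    facility i and city j; f[i] is the opening cost of facility i.
    Returns (set of open facilities, city -> facility map, budgets alpha)."""
    n_fac, n_city = len(cost), len(cost[0])
    opened, assigned = set(), {}
    alpha = [None] * n_city
    unconnected = set(range(n_city))
    t = 0.0  # common budget of all unconnected cities

    def opening_time(i):
        # Offers of connected cities do not change with time.
        fixed = sum(max(cost[assigned[j]][j] - cost[i][j], 0.0) for j in assigned)
        need = f[i] - fixed
        if need <= 0:
            return t
        # Smallest time at which unconnected cities offer `need` in total.
        cs = sorted(cost[i][j] for j in unconnected)
        prefix = 0.0
        for m, c in enumerate(cs, start=1):
            prefix += c
            cand = (need + prefix) / m
            if m == len(cs) or cand <= cs[m]:
                return max(cand, t)
        return math.inf

    while unconnected:
        # Event (a): an unconnected city reaches an open facility.
        t_a = min((cost[i][j] for i in opened for j in unconnected), default=math.inf)
        # Event (b): an unopened facility is fully paid.
        t_b, i_b = min(((opening_time(i), i) for i in range(n_fac) if i not in opened),
                       default=(math.inf, None))
        if t_a <= t_b:
            t = max(t, t_a)
            for j in list(unconnected):
                i = min(opened, key=lambda i: cost[i][j])
                if cost[i][j] <= t:
                    assigned[j], alpha[j] = i, t
                    unconnected.remove(j)
        else:
            t = t_b
            opened.add(i_b)
            for j in list(assigned):       # connected cities with positive offer switch
                if cost[assigned[j]][j] > cost[i_b][j]:
                    assigned[j] = i_b
            for j in list(unconnected):    # unconnected cities with positive offer connect
                if cost[i_b][j] < t:
                    assigned[j], alpha[j] = i_b, t
                    unconnected.remove(j)
    return opened, assigned, alpha

if __name__ == "__main__":
    print(algorithm_a2([[1, 2]], [3]))
    print(algorithm_a2([[0, 5], [5, 0]], [1, 1]))
```

In the first instance both cities pay for the single facility, and the budgets add up to the cost of the solution (3 + 1 + 2 = 6).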

For the metric case, the approximation factor is 1.61. With a completely analogous
reasoning, we obtain the corresponding factor-revealing program (15). The variables are
the same as in program (2). The new variable $r_{jl}$ corresponds to the budget $\alpha_j$ if city j is
connected at the same time as city l, or corresponds to the distance from j to the facility to
which j is connected just before l is connected.

\[
\begin{array}{rll}
z_k^{A2} = \text{maximize} & \dfrac{\sum_{j=1}^{k} \alpha_j}{f + \sum_{j=1}^{k} d_j} \\
\text{subject to} & \alpha_j \le \alpha_{j+1} & \forall\, 1 \le j < k \\
& r_{jl} \ge r_{j,l+1} & \forall\, 1 \le j < l < k \\
& \sqrt{\alpha_l} \le \sqrt{r_{jl}} + \sqrt{d_l} + \sqrt{d_j} & \forall\, 1 \le j < l \le k \\
& \sum_{j=1}^{l-1} \max(r_{jl} - d_j, 0) + \sum_{j=l}^{k} \max(\alpha_l - d_j, 0) \le f & \forall\, 1 \le l \le k \\
& \alpha_j, d_j, f, r_{jl} \ge 0 & \forall\, 1 \le j \le l \le k.
\end{array}
\tag{15}
\]

We repeat the previous analysis to give lower and upper bounds on the approximation
factor of the second algorithm for the SMFLP.

Lemma 15. $2.415 \le \sup_{k \ge 1} z_k^{A2} \le 2.425$.

Proof. First, we obtain an upper bound factor-revealing program. See details in Appendix A.
This program is exactly the same as program (15), except the fourth constraint is replaced
with

\[
\sum_{j=1}^{l-1} \max(r_{jl} - d_j, 0) + \sum_{j=l+1}^{k} \max(\alpha_l - d_j, 0) \le f.
\]

Let $x_k^{A2}$ be the optimal value of such a program. With CPLEX we get that $z_{500}^{A2} \approx 2.41565 >
2.415$, and that $x_{500}^{A2} \approx 2.42473 < 2.425$. □

Solving the upper bound factor-revealing LP obtained for the MFLP for k = 500 we may
show that the approximation factor of A2 [11] is 1.602. The lower bound factor-revealing
program and the maximization upper bound factor-revealing program are essentially the
same, except for the extra terms of the kind $\max(\alpha_l - d_l, 0)$. Therefore, Lemma 13 also holds
for such programs. For the metric case, using a similar analysis to that of Theorem 14, one
can show that the lower and the upper bound factor-revealing programs converge.

Theorem 16. Let $z_k^{A2}$ be as in program (25) in [11] and let $x_k^{A2}$ be the optimal value of the
corresponding upper bound factor-revealing program obtained by removing the terms of the
kind $\max(\alpha_l - d_l, 0)$ from the fourth restriction. Then $\sup_{k \ge 1} z_k^{A2} = \inf_{k \ge 1} x_k^{A2}$.

5 Scaling and greedy augmentation 

Algorithm A2 can also be analyzed as a bi-factor approximation algorithm. The analysis
uses a factor-revealing linear program, and is similar to the previous analysis. Mahdian, Ye
and Zhang [17] observed that, due to the asymmetry between the approximation guarantee
for the opened facilities cost and the connections cost, Algorithm A2 may be used to open
facilities that are very economical. This gives rise to a two-phase algorithm, denoted here
by A3($\delta$), based on scaling the cost of facilities by a constant $\delta \ge 1$, and on the greedy
augmentation technique introduced by Guha and Khuller [7]. The first phase opens the
most economical facilities, and the second phase greedily includes facilities that reduce the
cost of the solution.

Algorithm A3($\delta$) (C, F, c, f) [17]

1. Scaling:

(a) Scale the facility costs by a factor $\delta$.

(b) Run Algorithm A2 on the scaled instance.

2. Greedy augmentation: While there are facilities that reduce the total cost:

(a) Compute the gain $g_i$ of opening each unopened facility i.

(b) Open a facility i that maximizes the ratio $g_i / f_i$.
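
The two phases above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the gain of a facility is taken to be the decrease in total cost obtained by opening it, the first phase is passed in as a callable (`phase_one`, a hypothetical parameter standing for Algorithm A2), and the initial set of open facilities is assumed to be nonempty.

```python
import math

def total_cost(opened, cost, f):
    """Opening cost plus the cost of connecting each city to its closest open facility."""
    n_city = len(cost[0])
    return (sum(f[i] for i in opened)
            + sum(min(cost[i][j] for i in opened) for j in range(n_city)))

def greedy_augmentation(opened, cost, f):
    """Repeatedly open the facility with the largest ratio gain / opening cost,
    while some facility has positive gain."""
    opened = set(opened)
    while True:
        current = total_cost(opened, cost, f)
        best, best_ratio = None, 0.0
        for i in range(len(f)):
            if i in opened:
                continue
            gain = current - total_cost(opened | {i}, cost, f)
            if gain > 0:
                ratio = math.inf if f[i] == 0 else gain / f[i]
                if ratio > best_ratio:
                    best, best_ratio = i, ratio
        if best is None:
            return opened
        opened.add(best)

def scale_and_augment(cost, f, delta, phase_one):
    """Phase 1: run phase_one on facility costs scaled by delta.
    Phase 2: greedy augmentation with the original costs."""
    opened = phase_one(cost, [delta * fi for fi in f])
    return greedy_augmentation(opened, cost, f)

if __name__ == "__main__":
    cost = [[1, 1, 10, 10], [10, 10, 1, 1], [5, 5, 5, 5]]
    f = [2, 2, 100]
    print(greedy_augmentation({0}, cost, f))
    print(scale_and_augment(cost, f, 2.0, lambda c, fs: {0}))
```

In the example, starting from facility 0 alone (total cost 24), opening facility 1 has gain 16 and ratio 8, while facility 2 never has positive gain.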

In [17], a factor-revealing linear program is used to analyze Algorithm A3($\delta$) using a
somewhat different, but equivalent, greedy augmentation procedure. This was used to balance
a bi-factor from Algorithm A2 for the MFLP. As noticed by Byrka and Aardal [3],
this analysis is not restricted to Algorithm A2, and applies to any bi-factor approximation
for the FLP. Therefore, since it does not depend on the cost function being a metric, we
can use it to balance a bi-factor approximation for the squared metric case. This result is
precisely stated as follows.

Lemma 17 ([17]). Consider a $(\gamma_f, \gamma_c)$-approximation for the FLP. Then, for every $\delta \ge 1$,
Algorithm A3($\delta$) is a $\left(\gamma_f + \ln \delta + \varepsilon,\ 1 + \frac{\gamma_c - 1}{\delta}\right)$-approximation for the FLP.



For the metric case, it has been shown that Algorithm A2 is a (1.11, 1.78)-approximation. 
This and Lemma 17 give a 1.52- approximation for the MFLP. For the SMFLP, we present 
an analysis based on an upper bound factor-revealing program. Using straightforward cal- 
culations, we may obtain the following: 

Lemma 18. Let 7/ > 1 5e a fixed value and let 7c = x^^" , where 



xt^" = max 



(16) 



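\[
\begin{array}{rll}
\text{s.t.} & \alpha_l \le \alpha_{l+1} & \forall\, 1 \le l < k \\
& r_{jl} \ge r_{j,l+1} & \forall\, 1 \le j < l < k \\
& \sqrt{\alpha_l} \le \sqrt{r_{jl}} + \sqrt{d_l} + \sqrt{d_j} & \forall\, 1 \le j < l \le k \\
& \sum_{j=1}^{l-1} \max(r_{jl} - d_j, 0) + \sum_{j=l+1}^{k} \max(\alpha_l - d_j, 0) \le f & \forall\, 1 \le l \le k \\
& \alpha_j, d_j, f, r_{jl} \ge 0 & \forall\, 1 \le j \le l \le k.
\end{array}
\]
Then, if $\gamma_c < \infty$, Algorithm A2 is a $(\gamma_f, \gamma_c)$-approximation for the SMFLP.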

The only difference between program (16) and the corresponding lower bound factor-revealing
program is the extra term $\max(\alpha_l - d_l, 0)$ in the lower bound program, which is not
in the fourth constraint of program (16). Again, having a bound for this term is sufficient to
show convergence of the upper and lower bound factor-revealing programs. For the metric
case, this can be done easily. Notice that we may assume $r_{jl} \le \alpha_j$, so, using a similar analysis
to that of Theorem 16, one can show that, if $z_k$ and $x_k$ are solutions for the lower and upper
bound programs respectively, then Xk — •jfS < Zk < Xk, for some e = 0(|). 

We observe that program (16) is unbounded for values of $\gamma_f$ close to one. This also
happens for the corresponding lower bound factor-revealing program. This is in contrast to the
factor-revealing programs obtained for the metric case, for which we know that Algorithm
A2 is a (1, 2)-approximation. In this case, the lower bound program is always bounded, but
the upper bound program is unbounded for $\gamma_f = 1$, or for values close to one. It would be
interesting to strengthen this upper bound factor-revealing program, so that it could also be
used in the analysis for $\gamma_f = 1$.

Theorem 19. Algorithm A3 is a 2.17-approximation for the SMFLP.

Proof. Consider program (16) for $\gamma_f = 1.45$. Numerical computations using CPLEX show
that $x_{300} \approx 3.40339 < 3.4034$. From Lemma 18, we get that Algorithm A2 is a (1.45, 3.4034)-
approximation for the SMFLP. Now, for $\delta = 2.0543$, Lemma 17 states that Algorithm A3
is a (2.169..., 2.169...)-approximation for the SMFLP. □
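
The arithmetic behind the choice of $\delta$ can be reproduced with a short Python sketch. It evaluates the bi-factor of Lemma 17 (ignoring the arbitrarily small $\varepsilon$) and finds by bisection the $\delta$ that balances the two factors. The function names and the bisection bounds are assumptions made for this illustration.

```python
import math

def a3_bifactor(gamma_f, gamma_c, delta):
    """Bi-factor given by Lemma 17, without the arbitrarily small epsilon term."""
    return gamma_f + math.log(delta), 1.0 + (gamma_c - 1.0) / delta

def balancing_delta(gamma_f, gamma_c, lo=1.0, hi=1e6, iters=200):
    """Bisection for the delta at which both factors coincide. The facility
    factor increases and the connection factor decreases with delta;
    assumes gamma_f <= gamma_c."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        fac, con = a3_bifactor(gamma_f, gamma_c, mid)
        if fac < con:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

if __name__ == "__main__":
    delta = balancing_delta(1.45, 3.4034)
    print(delta, a3_bifactor(1.45, 3.4034, delta))  # squared metric case
    delta = balancing_delta(1.11, 1.78)
    print(delta, a3_bifactor(1.11, 1.78, delta))    # metric case
```

With (1.45, 3.4034) the balanced value is close to $\delta = 2.0543$ and both factors are about 2.1699, in agreement with Theorem 19. With (1.11, 1.78) the balanced factor is below 1.52.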



6 An optimal approximation algorithm 

Byrka and Aardal [3] (see also [4]) gave a 1.5-approximation for the MFLP combining a
(1.11, 1.78)-approximation from Jain, Mahdian and Saberi [12] and a new analysis of the
LP-rounding algorithm $CS(\gamma)$ of Chudak and Shmoys [6], that leads to a (1.6774, 1.3737)-
approximation. Byrka showed that $CS(\gamma)$ has the optimal bi-factor approximation
$(\gamma, 1 + 2e^{-\gamma})$ for $\gamma \ge \gamma_0 \approx 1.6774$. By randomly selecting $\gamma$ according to a given probability
distribution, Li [15] improved this result to 1.488, which is currently the best known approximation
for the MFLP.

We show that $CS(\gamma)$, when applied to the SMFLP, touches its optimal bi-factor approximation
curve $(\gamma, 1 + 8e^{-\gamma})$ for $\gamma \ge \gamma_0 \approx 2.00492$. Therefore, we have an $(\alpha, \alpha)$-approximation
for the SMFLP, where $\alpha \approx 2.04011$ is the solution of equation $\gamma = 1 + 8e^{-\gamma}$. Since $\alpha$ is
the approximation lower bound, this result implies that $CS(\alpha)$, solely used, is an optimal
approximation for the SMFLP.

The natural linear program relaxation is given in the following:

\[
\begin{array}{rll}
\min & \sum_{i \in F} y_i f_i + \sum_{j \in C} \sum_{i \in F} x_{ij} c_{ij} \\
\text{s.t.} & \sum_{i \in F} x_{ij} = 1 & \forall\, j \in C \\
& x_{ij} \le y_i & \forall\, i \in F,\ j \in C \\
& x_{ij}, y_i \ge 0 & \forall\, i \in F,\ j \in C.
\end{array}
\tag{17}
\]

The corresponding integer variables $y_i$ indicate whether facility i is open, and the
corresponding integer variables $x_{ij}$ indicate whether facility i serves city j in the solution.
Algorithm $CS(\gamma)$ may be summarized as follows. First, a solution $(x^*, y^*)$ of program (17)
is obtained. Then, the fractional opening variables $y^*$ are scaled by a factor $\gamma \ge 1$, $\bar{y} = \gamma y^*$,
and variables $\bar{x}_{ij}$ are defined so that city j is served entirely by its closest facilities, obtaining
a new solution $(\bar{x}, \bar{y})$. We may assume that this solution is complete, i.e., for every city j
and facility i, if $\bar{x}_{ij} > 0$, then $\bar{x}_{ij} = \bar{y}_i$, and that for every i, $\bar{y}_i \le 1$, since, in either case,
we can split facility i, and obtain an equivalent instance with these properties. Finally, a
clustering of some of the facilities is obtained according to a given criterion, and a
probabilistic rounding procedure is used to obtain the final solution. For a detailed description of
the algorithm, see [3] (also [4]).

A facility i with $\bar{x}_{ij} > 0$ is called a close facility of city j, and the set of such facilities is
denoted by $C_j$. Similarly, a facility i with $\bar{x}_{ij} = 0$ but $x^*_{ij} > 0$ is called a distant facility of j,
and the set of such facilities is denoted by $D_j$. Let $F_j = C_j \cup D_j$. The analysis of $CS(\gamma)$ uses
the notion of average distance between a city $j \in C$ and a subset of facilities $F' \subseteq F$ such
that $\sum_{i \in F'} \bar{y}_i > 0$, defined as

\[
d(j, F') = \frac{\sum_{i \in F'} c_{ij}\, \bar{y}_i}{\sum_{i \in F'} \bar{y}_i}.
\]

For a city j, we also use some definitions from [4]: the average connection cost, $d_j = d(j, F_j)$;
the average distance from close facilities, $d_j^{(c)} = d(j, C_j)$; the average distance from distant
facilities, $d_j^{(d)} = d(j, D_j)$; the maximum distance from close facilities, $d_j^{(\max)} = \max_{i \in C_j} c_{ij}$;
and the irregularity parameter $\rho_j$, defined as $\rho_j = (d_j - d_j^{(c)})/d_j$ if $d_j > 0$, and $\rho_j = 0$ otherwise.

With these definitions, we may describe the clustering of the facilities. In each iteration,
greedily select a city j, called the cluster center, such that the sum $d_j^{(c)} + d_j^{(\max)}$ is minimum,
and build a cluster formed by j and its close facilities $C_j$. Remove j and every other city j'
such that $C_j \cap C_{j'}$ is not empty, and repeat this process until every city is removed. The set
of facilities opened by $CS(\gamma)$ is given by the following rounding procedure: for each cluster
center j, open one facility i from $C_j$ with probability $\bar{x}_{ij} = \bar{y}_i$, and, for each unclustered
facility i, open it independently with probability $\bar{y}_i$. Each city is connected to its closest
opened facility.
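
The greedy selection of cluster centers described above can be sketched in Python. The input format is an assumption made for this illustration: `close` maps each city to its set of close facilities $C_j$, and `score` maps each city to the value $d_j^{(c)} + d_j^{(\max)}$. Ties are broken by city identifier, and the rounding step is not included.

```python
def greedy_clustering(close, score):
    """Select cluster centers greedily by increasing score.

    Returns the list of cluster centers and a map from each city to the
    center that caused its removal (a center is mapped to itself)."""
    remaining = set(close)
    centers, center_of = [], {}
    while remaining:
        j = min(remaining, key=lambda c: (score[c], c))
        centers.append(j)
        for other in list(remaining):
            # remove the center and every city sharing a close facility with it
            if other == j or close[j] & close[other]:
                center_of[other] = j
                remaining.remove(other)
    return centers, center_of

if __name__ == "__main__":
    close = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}, 4: {"c"}}
    score = {1: 3.0, 2: 1.0, 3: 2.0, 4: 5.0}
    print(greedy_clustering(close, score))
```

By construction, the close facility sets of two distinct cluster centers are disjoint, so the rounding step may open exactly one facility per cluster.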

The following lemma of Byrka and Aardal [3] is used to bound the expected connection
cost between a city and the closest facility from a set of facilities.

Lemma 20 ([3]). Consider a random vector $y \in \{0, 1\}^{|F|}$ produced by Algorithm $CS(\gamma)$, a
subset $A \subseteq F$ of facilities such that $\sum_{i \in A} \bar{y}_i > 0$, and a city $j \in C$. Then, the following
holds:

\[
E\Big[ \min_{i \in A,\ y_i = 1} c_{ij} \ \Big|\ \sum_{i \in A} y_i \ge 1 \Big] \le d(j, A).
\]



For a given city j, if one facility in $C_j$ or $D_j$ is opened, then Lemma 20 states that
the expected connection cost is bounded by $d_j^{(c)}$ and $d_j^{(d)}$, respectively. If no facility in
$C_j \cup D_j = F_j$ is opened, then city j can always be connected to one of the close facilities $C_{j'}$
of the associated cluster center j', with expected connection cost $d(j, C_{j'} \setminus F_j)$. Byrka and
Aardal [3] showed that, for the MFLP, when $\gamma < 2$, this cost is at most $d_j^{(d)} + d_{j'}^{(\max)} + d_{j'}^{(c)}$.
Since for the SMFLP we need $\gamma > 2$, we will use an improved version of this lemma by
Li [15]. The adapted lemma for the squared metric is given in the following. The proof is
the same, except that we use the squared metric property, instead of the triangle inequality.

Lemma 21. Let j be a city and j' be the associated cluster center such that $C_j \cap C_{j'} \ne \emptyset$.
Then, $d(j, C_{j'} \setminus F_j) \le 3 \cdot \big( (2 - \gamma)\, d_j^{(\max)} + (\gamma - 1)\, d_j^{(d)} + d_{j'}^{(\max)} + d_{j'}^{(c)} \big)$.



Proof. Let $d_{jj'} = \min_{l \in F} (c_{lj} + c_{lj'})$, that is, the minimum connection cost of a path of length
two from j to j'.* Fix a facility l such that $c_{lj} + c_{lj'} = d_{jj'}$. For each facility i in $C_{j'} \setminus F_j$, we
say that a path (j, l, j', i) is the center-path to i. The cost of the center-path to i is defined
as $d_{jj'} + c_{ij'}$. Notice that, using the squared metric property, $c_{ij} \le 3(d_{jj'} + c_{ij'})$, and therefore

\[
d(j, C_{j'} \setminus F_j)
= \frac{\sum_{i \in C_{j'} \setminus F_j} c_{ij}\, \bar{y}_i}{\sum_{i \in C_{j'} \setminus F_j} \bar{y}_i}
\le \frac{\sum_{i \in C_{j'} \setminus F_j} 3 (d_{jj'} + c_{ij'})\, \bar{y}_i}{\sum_{i \in C_{j'} \setminus F_j} \bar{y}_i}
= 3 \cdot \big( d_{jj'} + d(j', C_{j'} \setminus F_j) \big).
\]

*In [15], the connection cost c is extended to a distance between j and j', and the triangle inequality is
then used to bound this distance with the connection cost of any path of length two. Here, we make a more
explicit definition to avoid confusion, since the squared metric property is not sufficient for this purpose.

That is, $d(j, C_{j'} \setminus F_j)$ is at most three times the average center-path cost. Following the lines
of Li [15, Lemma 1], we know that
$d_{jj'} + d(j', C_{j'} \setminus F_j) \le (2 - \gamma)\, d_j^{(\max)} + (\gamma - 1)\, d_j^{(d)} + d_{j'}^{(\max)} + d_{j'}^{(c)}$.
Therefore, the lemma holds. □

The next lemma follows from Lemma 21, and is straightforward.

Lemma 22. $d(j, C_{j'} \setminus F_j) \le 3 \big( \gamma d_j + (3 - \gamma)\, d_j^{(d)} \big)$.

Now, we can bound the expected facility and connection cost of a solution generated by
$CS(\gamma)$. The next theorem is an adapted version of Theorem 2.5 from [4].

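Theorem 23. For $\gamma > 1$, Algorithm $CS(\gamma)$ produces a solution (x, y) for the integer program
corresponding to (17) with expected facility and connection costs

\[
E[y_i f_i] = \gamma \cdot F_i^*, \quad\text{and}\quad
E\Big[ \min_{i \in F,\ y_i = 1} c_{ij} \Big] \le \max\Big\{ 1 + 8e^{-\gamma},\ \frac{5e^{-\gamma} + e^{-1}}{1 - \frac{1}{\gamma}} \Big\} \cdot C_j^*,
\]

where $F_i^* = y_i^* f_i$ and $C_j^* = \sum_{i \in F} x^*_{ij} c_{ij}$.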

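Proof. The expected cost of facility i is $E[y_i f_i] = \bar{y}_i f_i = \gamma \cdot y_i^* \cdot f_i = \gamma \cdot F_i^*$.

If j is a cluster center, one of its close facilities is open, and then the expected connection cost
is $d_j^{(c)} \le d_j = C_j^*$. We may assume that j is not a cluster center. Let $p_c$ be the probability that
the closest facility to j is in $C_j$, and $p_d$ the probability that it is in $D_j$. If neither case occurs,
then, with probability $p_s = 1 - p_c - p_d$, the closest facility is in $C_{j'} \setminus F_j$, where j' is the cluster
center associated with j. From definition, we have $d_j^{(c)} = (1 - \rho_j)\, d_j$, $d_j^{(d)} = \big(1 + \frac{\rho_j}{\gamma - 1}\big)\, d_j$, and
$\rho_j \le 1$. Also, from [3], we know that $p_s \le e^{-\gamma}$ and $p_c \ge 1 - e^{-1}$. Combining these facts with
Lemmas 20 and 22, we obtain

\[
\begin{aligned}
E\Big[ \min_{i \in F,\ y_i = 1} c_{ij} \Big]
&\le p_c \cdot d_j^{(c)} + p_d \cdot d_j^{(d)} + p_s \cdot 3 \big( \gamma d_j + (3 - \gamma)\, d_j^{(d)} \big) \\
&= \Big( (p_c + p_d + 9p_s) + \frac{(p_c + p_d + 9p_s) - (p_c + 3p_s)\gamma}{\gamma - 1}\, \rho_j \Big)\, d_j \\
&= \Big( (1 + 8p_s)(1 - \rho_j) + \frac{5p_s + 1 - p_c}{1 - \frac{1}{\gamma}}\, \rho_j \Big)\, d_j \\
&\le \Big( (1 + 8e^{-\gamma})(1 - \rho_j) + \frac{5e^{-\gamma} + e^{-1}}{1 - \frac{1}{\gamma}}\, \rho_j \Big)\, d_j \\
&\le \max\Big\{ 1 + 8e^{-\gamma},\ \frac{5e^{-\gamma} + e^{-1}}{1 - \frac{1}{\gamma}} \Big\}\, d_j.
\end{aligned}
\]

□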



Let $\gamma_0$ be the solution of equation $\frac{5e^{-\gamma} + e^{-1}}{1 - \frac{1}{\gamma}} = 1 + 8e^{-\gamma}$. For $\gamma \ge \gamma_0 \approx 2.00492$, the
maximum connection cost factor is $1 + 8e^{-\gamma}$, so $CS(\gamma)$ touches the curve $(\gamma, 1 + 8e^{-\gamma})$, that
is, its approximation factor is the best possible for the SMFLP, unless P = NP. The next
theorem follows immediately.

Theorem 24. Let $\alpha \approx 2.04011$ be the solution of equation $\gamma = 1 + 8e^{-\gamma}$. Then $CS(\alpha)$ is
an $\alpha$-approximation for the SMFLP and the approximation factor is the best possible unless
P = NP.
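
The constants $\gamma_0$ and $\alpha$ can be recovered numerically. The Python sketch below solves both equations by bisection on the interval [1.5, 3], which is an assumption chosen because each function changes sign there. The helper names are our own.

```python
import math

def bisect(g, lo, hi, iters=200):
    """Root of a continuous function g that changes sign on [lo, hi]."""
    assert g(lo) * g(hi) <= 0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

def connection_factor(gamma):
    """Connection cost factor of Theorem 23 for a scaling parameter gamma > 1."""
    return max(1 + 8 * math.exp(-gamma),
               (5 * math.exp(-gamma) + math.exp(-1)) / (1 - 1 / gamma))

# alpha: solution of gamma = 1 + 8 e^{-gamma}
alpha = bisect(lambda g: g - (1 + 8 * math.exp(-g)), 1.5, 3.0)

# gamma_0: point where both terms of the maximum coincide
gamma_0 = bisect(
    lambda g: (5 * math.exp(-g) + math.exp(-1)) / (1 - 1 / g) - (1 + 8 * math.exp(-g)),
    1.5, 3.0)

if __name__ == "__main__":
    print(gamma_0, alpha, connection_factor(alpha))
```

Since $\alpha > \gamma_0$, the maximum in Theorem 23 is attained by $1 + 8e^{-\gamma}$ at $\gamma = \alpha$, so the facility and connection factors coincide at $\alpha$.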

7 Concluding remarks and future works 

In this paper, we considered the SMFLP, a generalization of the E$^2$FLP, when the square
root of the connection cost function is a metric. Four of the best known algorithms for the
MFLP were studied: three primal-dual algorithms [11, 17] analyzed through the so called
factor-revealing linear programs, and one algorithm based on LP-rounding [6]. We proved
that, when these algorithms are applied to SMFLP instances, they achieve approximation
factors of 2.87, 2.43, 2.17 and 2.04, respectively. Also, we showed that there is no 2.04-
approximation, unless P = NP, by extending the hardness results for the MFLP. Therefore,
the LP-rounding approximation algorithm is the best possible for the SMFLP, assuming
P $\ne$ NP. This is in contrast to the MFLP, for which there is still a small gap between the
1.463 lower bound from Guha and Khuller [8], and the currently best known approximation
factor, 1.488 from Li [15], which is a combination of an LP-rounding algorithm of Chudak
and Shmoys [6] and a primal-dual algorithm of Jain et al. [11].

We presented a new technique for deriving upper bound factor-revealing programs, that 
can be solved by computer, as an alternative way to obtain an upper bound on the approxi- 
mation factors. This analysis allowed us to tighten the obtained approximation factors, and 
to simplify the analysis of the three primal-dual algorithms, when used for both SMFLP 
and MFLP instances. These upper bound factor-revealing programs are analogous to the
strongly factor-revealing linear programs presented recently by Mahdian and Yan [16], but 
were developed independently and are obtained using a different approach. We notice that 
the programs derived for the SMFLP have particular nonlinear convex constraints involving 
square roots. To obtain the upper bound factor-revealing programs, we used a cutting plane 
strategy to replace such constraints with linear constraints. 

The squared metric property captures the $\ell_2^2$ distance function, which is widely used in
k-means and several classification applications. In this paper, we analyzed the squared metric
property in the context of the FLP. We hope that the analysis done here could also be 
extended to other relaxed metric cost functions, using the same techniques. Additionally, 
it would be interesting to investigate whether the techniques used here to obtain the upper 
bound factor-revealing programs for the FLP can be used in the analysis of factor-revealing 
programs of different problems. 
A Upper Bound Factor-Revealing Program for A2

Consider tuples (A, ji, 5^) E Rl^ and Bi = 1 + (3i + ^, d = 1 + -fi + }, Di = 1 + 5i + }- for 

^ li Pi 

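$1 \le i \le m$. Using Lemma 1, we insert inequalities corresponding to these tuples, replacing
the nonlinear constraint, and obtain $z_k^{A2} \le w_k^{A2}$, where $w_k^{A2}$ is given by

\[
\begin{array}{rll}
w_k^{A2} = \max & \sum_{j=1}^{k} \alpha_j \\
\text{s.t.} & f + \sum_{j=1}^{k} d_j \le 1 \\
& \alpha_j \le \alpha_{j+1} & \forall\, 1 \le j < k \\
& r_{jl} \ge r_{j,l+1} & \forall\, 1 \le j < l < k \\
& \alpha_l \le B_i r_{jl} + C_i d_l + D_i d_j & \forall\, 1 \le j < l \le k,\ 1 \le i \le m \\
& r_{jl} - d_j \le x_{jl} & \forall\, 1 \le j < l \le k \\
& \alpha_l - d_j \le x_{jl} & \forall\, 1 \le l \le j \le k \\
& \sum_{j=1}^{k} x_{jl} \le f & \forall\, 1 \le l \le k \\
& \alpha_j, d_j, f, r_{jl} \ge 0 & \forall\, 1 \le j \le l \le k \\
& x_{jl} \ge 0 & \forall\, 1 \le j, l \le k.
\end{array}
\tag{18}
\]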

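Now, we calculate the dual of program (18) to derive the upper bound factor-revealing
linear program. After that, we calculate its dual program (22), in order to use Lemma 1,
and solve the upper bound factor-revealing program inserting cutting planes. We proceed
the same way as done in Lemma 11. With similar arguments, we may see that $z_k^{A2} \le z_{kt}^{A2}$,
for any $t$, and we assume that $k$ has the form $k = pt$, with $p$ and $t$ positive integers. The dual
of linear program (18) is given in the following.

\[
\begin{array}{rll}
w_k^{A2} = \min & \gamma \\
\text{s.t.} & a_l - a_{l-1} + \sum_{i=1}^{m} \sum_{j=1}^{l-1} c_{jli} + \sum_{j=l}^{k} e_{jl} \ge 1 & \forall\, 1 \le l \le k \\
& \gamma - \sum_{i=1}^{m} C_i \sum_{j=1}^{l-1} c_{jli} - \sum_{i=1}^{m} D_i \sum_{j=l+1}^{k} c_{lji} - \sum_{j=1}^{k} e_{lj} \ge 0 & \forall\, 1 \le l \le k \\
& \gamma - \sum_{l=1}^{k} h_l \ge 0 \\
& b_{j,l-1} - b_{jl} + e_{jl} - \sum_{i=1}^{m} B_i c_{jli} \ge 0 & \forall\, 1 \le j < l \le k \\
& h_l - e_{jl} \ge 0 & \forall\, 1 \le j, l \le k \\
& a_0 = a_k = b_{ll} = b_{lk} = 0 & \forall\, 1 \le l \le k \\
& a_l, h_l, e_{jl} \ge 0 & \forall\, 1 \le l, j \le k \\
& b_{jl}, c_{jli} \ge 0 & \forall\, 1 \le j < l \le k,\ 1 \le i \le m.
\end{array}
\tag{19}
\]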

Now, we may derive the upper bound factor-revealing linear program. Let $\hat{n} = \lceil n/p \rceil$ and
consider prime variables $\gamma'$, $a'_l$, $b'_{jl}$, $c'_{jli}$, $e'_{jl}$, $h'_l$. We obtain a candidate solution for program (19)
by defining:

7 = 7', a.=pa^-(W-0(a^«L)' = " "^(^k " ,,n^ 

h'. (20) 

c,7/ = en = and /i^ = 

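In the following, we apply definition (20) and calculate each coefficient expression for
program (19). Again, notice that $a_l - a_{l-1} = a'_{\hat{l}} - a'_{\hat{l}-1}$, and that $b_{j,l-1} - b_{jl} = (b'_{\hat{j},\hat{l}-1} - b'_{\hat{j}\hat{l}})/p$.
Also, fix variables $c'_{lli}$ at zero.

\[
\begin{aligned}
\mathrm{coeff}[\alpha_l]
&= a_l - a_{l-1} + \sum_{i=1}^{m} \sum_{j=1}^{l-1} c_{jli} + \sum_{j=l}^{k} e_{jl} \\
&= a'_{\hat{l}} - a'_{\hat{l}-1} + \sum_{i=1}^{m} \sum_{j=1}^{l-1} \frac{c'_{\hat{j}\hat{l}i}}{p} + \sum_{j=l}^{pt} \frac{e'_{\hat{j}\hat{l}}}{p} \\
&\ge a'_{\hat{l}} - a'_{\hat{l}-1} + \sum_{i=1}^{m} \sum_{j'=1}^{\hat{l}-1} p\,\frac{c'_{j'\hat{l}i}}{p} + \sum_{j'=\hat{l}+1}^{t} p\,\frac{e'_{j'\hat{l}}}{p} \\
&= a'_{\hat{l}} - a'_{\hat{l}-1} + \sum_{i=1}^{m} \sum_{j'=1}^{\hat{l}-1} c'_{j'\hat{l}i} + \sum_{j'=\hat{l}+1}^{t} e'_{j'\hat{l}} \ \ge\ 1.
\end{aligned}
\]

\[
\begin{aligned}
\mathrm{coeff}[d_l]
&= \gamma - \sum_{i=1}^{m} C_i \sum_{j=1}^{l-1} c_{jli} - \sum_{i=1}^{m} D_i \sum_{j=l+1}^{k} c_{lji} - \sum_{j=1}^{k} e_{lj} \\
&= \gamma' - \sum_{i=1}^{m} C_i \sum_{j=1}^{l-1} \frac{c'_{\hat{j}\hat{l}i}}{p} - \sum_{i=1}^{m} D_i \sum_{j=l+1}^{k} \frac{c'_{\hat{l}\hat{j}i}}{p} - \sum_{j=1}^{k} \frac{e'_{\hat{l}\hat{j}}}{p} \\
&\ge \gamma' - \sum_{i=1}^{m} C_i \sum_{j'=1}^{\hat{l}} p\,\frac{c'_{j'\hat{l}i}}{p} - \sum_{i=1}^{m} D_i \sum_{j'=\hat{l}}^{t} p\,\frac{c'_{\hat{l}j'i}}{p} - \sum_{j'=1}^{t} p\,\frac{e'_{\hat{l}j'}}{p} \\
&= \gamma' - \sum_{i=1}^{m} C_i \sum_{j'=1}^{\hat{l}-1} c'_{j'\hat{l}i} - \sum_{i=1}^{m} D_i \sum_{j'=\hat{l}+1}^{t} c'_{\hat{l}j'i} - \sum_{j'=1}^{t} e'_{\hat{l}j'} \ \ge\ 0.
\end{aligned}
\]

\[
\mathrm{coeff}[f] = \gamma - \sum_{l=1}^{k} h_l = \gamma' - \sum_{l=1}^{pt} \frac{h'_{\hat{l}}}{p} = \gamma' - \sum_{l'=1}^{t} p\,\frac{h'_{l'}}{p} = \gamma' - \sum_{l'=1}^{t} h'_{l'} \ \ge\ 0.
\]

\[
\mathrm{coeff}[r_{jl}] = b_{j,l-1} - b_{jl} + e_{jl} - \sum_{i=1}^{m} B_i c_{jli}
= \frac{b'_{\hat{j},\hat{l}-1} - b'_{\hat{j}\hat{l}}}{p} + \frac{e'_{\hat{j}\hat{l}}}{p} - \sum_{i=1}^{m} B_i \frac{c'_{\hat{j}\hat{l}i}}{p} \ \ge\ 0.
\]

\[
\mathrm{coeff}[x_{jl}] = h_l - e_{jl} = \frac{h'_{\hat{l}}}{p} - \frac{e'_{\hat{j}\hat{l}}}{p} \ \ge\ 0.
\]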

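Conjoining all constraints, the obtained upper bound factor-revealing linear program is:

\[
\begin{array}{rll}
x_t^{A2} = \min & \gamma \\
\text{s.t.} & a_l - a_{l-1} + \sum_{i=1}^{m} \sum_{j=1}^{l-1} c_{jli} + \sum_{j=l+1}^{t} e_{jl} \ge 1 & \forall\, 1 \le l \le t \\
& \gamma - \sum_{i=1}^{m} C_i \sum_{j=1}^{l-1} c_{jli} - \sum_{i=1}^{m} D_i \sum_{j=l+1}^{t} c_{lji} - \sum_{j=1}^{t} e_{lj} \ge 0 & \forall\, 1 \le l \le t \\
& \gamma - \sum_{l=1}^{t} h_l \ge 0 \\
& b_{j,l-1} - b_{jl} + e_{jl} - \sum_{i=1}^{m} B_i c_{jli} \ge 0 & \forall\, 1 \le j < l \le t \\
& h_l - e_{jl} \ge 0 & \forall\, 1 \le j, l \le t \\
& a_0 = a_t = b_{ll} = b_{lt} = 0 & \forall\, 1 \le l \le t \\
& a_l, h_l, e_{jl} \ge 0 & \forall\, 1 \le l, j \le t.
\end{array}
\tag{21}
\]

Finally, calculating the dual of program (21), we obtain program (22).

\[
\begin{array}{rll}
x_t^{A2} = \max & \sum_{j=1}^{t} \alpha_j \\
\text{s.t.} & f + \sum_{j=1}^{t} d_j \le 1 \\
& \alpha_j \le \alpha_{j+1} & \forall\, 1 \le j < t \\
& r_{jl} \ge r_{j,l+1} & \forall\, 1 \le j < l < t \\
& \alpha_l \le B_i r_{jl} + C_i d_l + D_i d_j & \forall\, 1 \le j < l \le t,\ 1 \le i \le m \\
& r_{jl} - d_j \le x_{jl} & \forall\, 1 \le j < l \le t \\
& \alpha_l - d_j \le x_{jl} & \forall\, 1 \le l < j \le t \\
& \sum_{j=1}^{t} x_{jl} \le f & \forall\, 1 \le l \le t \\
& \alpha_j, d_j, f, r_{jl} \ge 0 & \forall\, 1 \le j \le l \le t \\
& x_{jl} \ge 0 & \forall\, 1 \le j, l \le t.
\end{array}
\tag{22}
\]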



B Experimental results 



In Table 1, we present computational results using CPLEX for the lower bound (column $z_k^{A1}$)
and upper bound (column $x_k^{A1}$) for the approximation factor of Algorithm A1. In Table 2,
we present lower and upper bounds on the approximation factor of Algorithm A2 (columns
$z_k^{A2}$ and $x_k^{A2}$, respectively). In Table 3, we present computational results for program (15)
when $\gamma_f = 1.45$, and the approximation factor obtained from Lemma 17. The chosen $\delta$ is
given by the solution of equation 7/ + ln5 = 1 + that is, 5 = (^^^^^^^'^-^y^'^)-^^!-^) . 
Figure 1 shows the trade-off between connection and facility costs approximation guarantees
for Algorithm A2, and Figure 2 shows the trend of the obtained factor for Algorithm A3 as
we vary the value of $\gamma_f$, when $k = 50$.



Table 1: Solutions of factor-revealing programs for A1.

    k     z_k^{A1}    x_k^{A1}
   10     2.57261     3.18162
   20     2.71704     3.01717
   50     2.80540     2.92579
  100     2.83534     2.89553
  200     2.85034     2.88046
  300     2.85532     2.87543
  400     2.85782     2.87292
  500     2.85930     2.87142
  600     2.86029     2.87041
  700     2.86099     2.86970

Table 2: Solutions of factor-revealing programs for A2.

    k     z_k^{A2}    x_k^{A2}
   10     2.20702     2.65131
   20     2.30987     2.53301
   50     2.37551     2.46544
  100     2.39773     2.44278
  200     2.40894     2.43150
  300     2.41267     2.42775
  400     2.41453     2.42586
  500     2.41565     2.42473

Table 3: Solutions of connection factor-revealing programs for A2, and obtained factor
for A3.

    k     x_k         best δ      factor
   10     4.02931     2.33433     2.29772
   20     3.64790     2.16561     2.22270
   50     3.48465     2.09159     2.18792
  100     3.43524     2.06895     2.17704
  200     3.41127     2.05793     2.17170
  300     3.40339     2.05430     2.16993

Figure 1: Trade-off between connection and facility approximation factors.
[Plot not reproduced. Panel title: "Trade-off between facility and connection factors".]

Figure 2: Trend of the obtained balanced approximation factors.
[Plot not reproduced. Panel title: "Balancing using scaling and greedy-augmentation"; curve: balanced factor; horizontal axis: facility factor; annotation: best factor at 1.45.]